Graph Neural Networks Approach For Joint Wireless Power Control and Spectrum Allocation


Received 1 December 2023; revised 19 March 2024 and 22 May 2024; accepted 27 May 2024.

Date of publication 3 June 2024; date of current version 10 June 2024.


The associate editor coordinating the review of this article and approving it for publication was A. Savard.
Digital Object Identifier 10.1109/TMLCN.2024.3408723

Graph Neural Networks Approach for Joint Wireless Power Control and Spectrum Allocation
MAHER MARWANI 1 (Student Member, IEEE), AND
GEORGES KADDOUM 1,2 (Senior Member, IEEE)
1 Electrical Engineering Department, École de Technologie Supérieure, Montreal, QC H3C 1K3, Canada
2 Artificial Intelligence and Cyber Systems Research Center, Department of Computer Science and Mathematics,
Lebanese American University, Beirut 797751, Lebanon
CORRESPONDING AUTHOR: M. MARWANI ([email protected])

ABSTRACT The proliferation of wireless technologies and the escalating performance requirements
of wireless applications have led to diverse and dynamic wireless environments, presenting formidable
challenges to existing radio resource management (RRM) frameworks. Researchers have proposed utilizing
deep learning (DL) models to address these challenges to learn patterns from wireless data and leverage the
extracted information to resolve multiple RRM tasks, such as channel allocation and power control. However,
it is noteworthy that the majority of existing DL architectures are designed to operate on Euclidean data,
thereby disregarding a substantial amount of information about the topological structure of wireless networks.
As a result, the performance of DL models may be suboptimal when applied to wireless environments
due to the failure to capture the network’s non-Euclidean geometry. This study presents a novel approach
to address the challenge of power control and spectrum allocation in an N-link interference environment
with shared channels, utilizing a graph neural network (GNN) based framework. In this type of wireless
environment, the available bandwidth can be divided into blocks, offering greater flexibility in allocating
bandwidth to communication links, but also requiring effective management of interference. One potential
solution to mitigate the impact of interference is to control the transmission power of each link while ensuring
the network’s data rate performance. Therefore, the power control and spectrum allocation problems are
inherently coupled and should be solved jointly. The proposed GNN-based framework presents a promising
avenue for tackling this complex challenge. Our experimental results demonstrate that our proposed approach
yields significant improvements compared to other existing methods in terms of convergence, generalization,
performance, and robustness, particularly in the context of an imperfect channel.

INDEX TERMS Intelligent resource allocation, RRM, 6G, GNN, D2D, AI.

I. INTRODUCTION

The sixth generation (6G) of wireless communications is expected to feature heterogeneous networks capable of supporting a vast number of connected devices while delivering high data rates, low latencies, and energy efficiency. Several technologies have been developed to meet these requirements, including those referenced in [1], [2], [3], and [4]. However, the increasing complexity of radio resource management (RRM) has emerged as a significant challenge with the proliferation of new technologies and diverse demands [5], [6]. Previous RRM solutions are insufficient in adapting to the novel heterogeneous wireless environment in terms of convergence time, generalization to different wireless contexts, and maintaining satisfactory performance while scaling up the number of devices. Therefore, novel approaches are required to address these challenges and pave the way for the efficient management of wireless resources in the upcoming 6G era.

The focus of our research is on the N-link interference environment with shared channels, a wireless network
2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License.
For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 2, 2024 717
architecture characterized by multiple communication links sharing the same available bandwidth. In this setting, the co-existence of multiple links causes significant interference and performance degradation, which necessitate effective management of transmission power control and spectrum allocation. This network structure can be observed in various wireless scenarios, including device-to-device (D2D) communication [7], [8], [9], [10], where multiple devices communicate directly without a network infrastructure, and uplink/downlink [11], [12], [13], [14], [15] scenarios, where a base station communicates with multiple users using the same spectrum, also known as non-orthogonal multiple access (NOMA). To maximize the network's sum rate, solving the mixed-integer, non-convex optimization problem involving power control and channel allocation is essential. However, obtaining a globally optimal solution within the required time is challenging. Therefore, researchers have proposed several near-optimal solutions for specific cases [16], [17], [18], [19], which tend to have high computational complexity and are impractical for real-time scenarios.

In recent years, researchers have explored the use of machine learning (ML) techniques to address wireless network optimization problems. Specifically, there has been interest in incorporating deep learning (DL) approaches, which have shown promise in a variety of applications. Two primary approaches have been pursued in this integration: (1) constructing end-to-end learnable architectures that can capture complex relationships between inputs and outputs [20], [21], [22], and (2) replacing computational blocks within existing solutions with DL architectures to reduce computational costs [23], [24]. Despite promising results, existing DL-based approaches have primarily focused on addressing isolated RRM tasks such as power control, user association, and link scheduling. Moreover, their scalability to large wireless networks is a concern as they scale linearly with respect to the size of the input data. Furthermore, techniques such as multi-layer perceptrons (MLPs) and convolutional neural networks (CNNs) can be subjected to overfitting and, thus, require large amounts of training data. Additionally, these methods rely on tabular data, such as channel state information (CSI), which ignores the network's underlying topology. Therefore, there is a need for further research to explore more effective ways to integrate DL approaches into wireless network optimization problems. Recent research has demonstrated the potential for improving the scalability and generalization of DL-based RRM solutions by integrating the target task's structure into the neural network architecture [25]. Given that wireless networks can be intuitively modeled as graph topologies, there is a growing interest in leveraging graph representation learning techniques to enhance the performance of RRM algorithms [26]. One such approach is Graph Neural Networks (GNNs), which possess several attractive properties, including permutation equivariance, scalability, generalization, high computational efficiency, and the ability to train efficiently on relatively small datasets [27]. The application of GNNs has yielded promising results in various domains, indicating its potential as an effective technique for enhancing the performance of RRM algorithms in wireless networks.

The primary aim of this research is to propose a solution that simultaneously addresses spectrum allocation and power control tasks. Initially, we formulate a network mean rate maximization problem, considering both RRM tasks and the minimum Quality of Service (QoS) required for each communication link. Subsequently, we create interference graphs from the network's CSI, enabling parallel processing without information loss. Additionally, we develop an end-to-end GNN-based framework that learns from these constructed graphs and embeds them into Euclidean space. This embedding is used to compute power and channel allocation solutions. In contrast to Deep Neural Network (DNN) models, our framework is both scalable and generalizable, requiring no retraining or architectural modification when changing the input size. It also excels in computational efficiency due to parallel execution. To enhance the model's generalization and training stability, we combine four loss functions: the supervised mean squared error for power control, the supervised cross-entropy for channel allocation, an unsupervised loss to avoid constraining the model with an upper-bound performance from supervised training, and a regulation loss to ensure QoS constraints are met. Lastly, we rigorously tested our approach, focusing on the training convergence, generalization across different wireless setups, network mean rate, QoS violation, scalability with input size, and robustness in the presence of imperfect channel estimation.

This paper is structured as follows. In Section II, a comprehensive literature review is presented to explore the previous work related to our research. In Section III, we introduce the N-link interference environment with shared channels, which is the problem setting that our proposed solution is designed to address. In Section IV, we present the optimization problem that we aim to solve. In Section V, we provide a detailed description of our proposed end-to-end solution architecture, which consists of various components, including CSI preprocessing, the GNN feature extractor, the MLP component, the loss function design, and the training process. In Section VI, we conduct extensive simulations to evaluate our proposed solution's performance in terms of stability, generalization, and robustness compared to the state-of-the-art methods. Finally, in Section VII, we present our conclusions and future research directions.

II. LITERATURE REVIEW

Numerous studies have focused on solving the power control and spectrum allocation problems in different network topologies, either independently or concurrently. For example, in [16], the authors proposed a dual-based iterative algorithm that allocates resources to D2D pairs while maintaining the quality of service requirements. Another study [19] developed a two-stage algorithm to maximize the energy efficiency of D2D communication under cellular



Marwani and Kaddoum: GNNs Approach for Joint Wireless Power Control and Spectrum Allocation

constraints, assuming that each D2D link could use one sub-channel at most to decrease the computational complexity. Conversely, this research [17] proposed a channel and power allocation scheme with channel reuse based on the Hungarian algorithm and a prioritizing method. Moreover, this work [18] employed a game-theory approach to manage the reuse of multiple channels by multiple D2D pairs. Despite the various proposed solutions, most of them were heuristic approaches or tended to convexify the RRM problems, resulting in a high computational complexity. Additionally, they did not provide complete flexibility in allocating multiple channels to multiple links, mainly because of convergence issues.

Given the limitations of model-based and heuristic solutions, researchers have turned to learnable approaches by integrating DL architecture to tackle RRM problems. For example, these works [21], [28], [29] integrated a DL component to learn the optimal pruning policy for the branch-and-bound (B&B) algorithm to solve mixed-integer nonlinear programming (MINLP) problems. While this approach simplifies the problem significantly and reduces the exponential computation of the traditional B&B algorithm, intense sampling is required to train the DL architecture since the training is supervised. To alleviate the need for training data, unsupervised learning approaches have been explored [30]. This work considers constructing a DNN framework to solve beamforming problems over an imperfect channel, which is trained in an unsupervised fashion using the negative of the sum rate. Similarly, [22] constructed a DNN that takes CSI and computes the power of each user. However, this work did not integrate the minimum rate constraints into the training process, which raises questions about the solution's feasibility. Another approach to training DL models is combining supervised and unsupervised losses. For instance, [31] proposed an end-to-end DL framework to solve resource allocation in multi-channel cellular systems with D2D links. Moreover, the approach can be implemented in a centralized manner, with full knowledge of the CSI, or distributed manner with partial CSI. However, the authors transform the continuous power variable into a set of discrete levels in order to use the cross-entropy loss. Following the same training approach in [24], a CNN model is employed to learn the patterns from CSI and output the power control that maximizes the energy or spectrum efficiency of the network. Despite the promising results achieved by the current DNN and CNN approaches, their lack of flexibility with input sizes is a significant limitation. Any alteration in input shape necessitates architectural modification. Furthermore, they prove inadequate in large wireless scenarios with a substantial number of connected devices. This deficiency stems from their heavy reliance on the quality of training data, which can be challenging to obtain in real-life situations. Moreover, the training process for these models is often time-consuming and typically conducted offline. Another drawback of DNN and CNN approaches is their disregard for the geometric information inherent in the input data.

To facilitate the incorporation of input data structures into DL models for RRM tasks, several solutions based on GNNs have been introduced [26]. These approaches have exhibited promise in tackling various RRM tasks, encompassing channel allocation, power control, and user association. For instance, in [32], a framework combining Deep Reinforcement Learning (DRL) and Graph Convolution Networks (GCNs) was proposed for channel allocation. This method enabled the agent to learn optimal channel assignments to access points using features extracted from the wireless environment as a state space. However, the model's testing was confined to a wireless setting with perfect channels and a relatively small number of devices. Similarly, [33] introduced a GNN-based framework for learning resource allocation strategies in wireless networks, offering reduced training times and improved scalability compared to conventional MLPs. Nonetheless, this framework was not well-suited for heterogeneous wireless devices or systems with single or multiple antennas. To address these limitations, [25] presented a more flexible GNN-based solution for constrained power allocation in a heterogeneous MIMO-interfering environment. Leveraging the permutation-invariant properties of RRM problems, this GNN architecture demonstrated excellent generalization across different problem scales with minimal training data. In addition, in [34], researchers aimed to find the optimal power control strategy in an uplink multi-cell network by combining DNNs with knowledge from the wireless network's topology, reducing training complexity and model parameters. However, this approach was tailored exclusively to the power control task. In contrast, [35] proposed employing GNNs to tackle power control and beamforming issues in heterogeneous D2D networks. Here, communication and interference links were represented as vertices in the wireless graph, and an unsupervised learning process was employed for the graph convolutional model. This method demonstrated favorable properties such as scalability and reduced execution time compared to alternative approaches. Similarly, [36] introduced an Access Point (AP) selection strategy for massive cell-free Multiple Input, Multiple Output (MIMO) systems based on GNNs. The authors constructed two graphs: a homogeneous one representing only AP nodes and a heterogeneous one containing both user equipments and AP nodes. However, these methods modeled the wireless network as a single graph, assuming that all communication links interfered with each other. GNNs can serve as end-to-end learnable solutions or feature extractors. For example, [37] proposed a joint optimization framework for user association and power control in a heterogeneous ultra-dense network. Similarly, [38] improved the Iteratively Weighted Minimum Mean Square Error (WMMSE) algorithm [39] by incorporating trainable components parametrized by GNNs. Simulations illustrated that the proposed method, unfolded WMMSE, delivered a comparable performance to WMMSE but with significantly lower time complexity.
The work in [40] introduced a trainable resilient RRM policy using an unsupervised primal-dual approach for power control and user association. Another paper [41] presented an edge-update empowered GNN architecture, enhancing GNNs' ability to handle node and edge variables and validating its permutation equivariance in power allocation scenarios. Additionally, [42] introduced Aggregation GNNs for decentralized resource allocation in wireless networks, utilizing a model-free primal-dual approach for asynchronous local information processing. The study in [43] proposed a distributed spectrum allocation scheme for vehicle-to-everything (V2X) networks using GNNs and multi-agent RL to optimize the network capacity. Furthermore, [44] discussed GNN-based frameworks for distributed power allocation in wireless networks, aimed at minimizing signaling overhead by incorporating Recurrent Neural Networks (RNNs) to capture temporal dynamics. The work in [45] offered a GNN framework to enhance power control and hybrid precoding in wireless systems, demonstrating scalability and efficiency. The work in [46] proposed a state-augmented algorithm for RRM in multi-user networks, ensuring feasible and nearly optimal decisions. In addition, [47] introduced a Heterogeneous GNN model for resource allocation in heterogeneous networks. The study in [48] presented a GNN-based scheme for RRM in wireless IoT networks, optimizing resources in D2D communications. Lastly, [49] explored the expressive power of GNNs in learning wireless policies, highlighting the limitations of Vertex-GNNs and the advantages of Edge-GNNs in resource allocation tasks.

Nevertheless, most of the mentioned GNN-based works are not adaptable to environments where multiple resource blocks have varying CSI. Additionally, many of these methods focus on addressing individual RRM tasks.

III. SYSTEM MODEL

We denote by $\mathcal{N} = \{1, 2, \ldots, N\}$ a set of active (scheduled) links distributed randomly in a two-dimensional environment. The distance between transmitter-receiver pairs varies across links. We adopt a non-orthogonal scheme for all communication links, where $\mathcal{K} = \{1, 2, \ldots, K\}$ is the set of resource blocks (RBs) with constant bandwidth $W$ that can be assigned to any link ($K \le N$). In this environment, a centralized control unit (CCU) controls the transmission power of each link and allocates the required bandwidth to ensure effective communication, as illustrated in Fig. 1. We operate in a time slot scenario where the CCU obtains CSI from scheduled links, performs resource management, and communicates decisions to all transmitters. Although the CCU possesses full CSI knowledge, it may still be subject to noise and errors, leading us to evaluate our approach under noisy CSI conditions to assess its robustness in Section V.

FIGURE 1. N-link interference environment with shared channels.

We assume that the bandwidth $W$ of each RB is small enough to exhibit flat-fading channel characteristics. Additionally, due to block fading, CSI values change independently from one time slot to the next, requiring independent resource allocation for each frame. We introduce $g_{ii}^k \in \mathbb{R}$ to represent the direct channel gain between the transmitter and receiver of the i-th link on the k-th RB, and $g_{ij}^k \in \mathbb{R}$ to denote the channel gain between the transmitter of the j-th link and the receiver of the i-th link. To represent all channel gains, we define $G = [G_1, \ldots, G_K] \in \mathbb{R}^{N \times N \times K}$ as the CSI tensor of all links in all resource blocks, where:

$$G_k = \begin{bmatrix} g_{11}^k & g_{12}^k & \cdots & g_{1N}^k \\ g_{21}^k & g_{22}^k & \cdots & g_{2N}^k \\ \vdots & \vdots & \ddots & \vdots \\ g_{N1}^k & g_{N2}^k & \cdots & g_{NN}^k \end{bmatrix}. \tag{1}$$

Taking into consideration the most common types of fading, the channel gain can be expressed as:

$$g_{ij}^k = \beta_{ij}^k \alpha_{ij}^k |h_{ij}^k|^2, \quad \forall k \in \mathcal{K},\ \forall (i, j) \in \mathcal{N} \times \mathcal{N}, \tag{2}$$

where $\beta_{ij}^k$ is the path loss proportional to the inverse of the distance, $\alpha_{ij}^k$ is the shadowing following the normal distribution, and $h_{ij}^k$ represents the small-scale Rayleigh fading.

Each transmitter is equipped with a single antenna, and we represent the power allocation for all links as $P = [p_1, p_2, \cdots, p_N]$, where $p_i$ is the transmission power of the i-th link. We also consider a maximum transmission power limit, denoted as $p_{\max}$, i.e., $p_i \le p_{\max}$. For RB assignment, we use binary variables $\psi_i^k \in \{0, 1\}$, where $\psi_i^k = 1$ indicates that the i-th link uses the k-th RB, and $\psi_i^k = 0$ otherwise. We denote the RB assignment for the i-th link as $\Psi_i = [\psi_i^1, \cdots, \psi_i^K]$. We assume that each link can use at most one RB per time slot, i.e., $\sum_{k=1}^{K} \psi_i^k \le 1$.

In our analysis, we focus on the dedicated mode, where links experience no interference from other users. This means external interference is not considered in our calculations. We evaluate the signal-to-interference-plus-noise ratio (SINR) for the k-th RB in the i-th receiver, which is defined


as:

$$\mathrm{SINR}_i^k = \frac{g_{ii}^k p_i}{\sum_{j \neq i}^{N} \psi_j^k g_{ij}^k p_j + N_0 W}, \quad \forall (i, k) \in \mathcal{N} \times \mathcal{K}, \tag{3}$$

where $N_0$ is the noise density per unit bandwidth. Consequently, using the defined variables, we can calculate each link's achievable rate as:

$$\gamma_i = W \sum_{k=1}^{K} \psi_i^k \log_2\left(1 + \mathrm{SINR}_i^k\right), \quad \forall i \in \mathcal{N}. \tag{4}$$

Our objective is to find values for the variables $P$ and $\Psi$ that maximize the average network rate while ensuring quality of service, power constraints, and bandwidth limitations. This leads us to formulate the optimization problem as follows:

$$\begin{aligned} \max_{P, \Psi} \quad & \frac{1}{N} \sum_{i=1}^{N} \gamma_i \\ \text{s.t.} \quad & \gamma_i \ge \gamma_{\min}, && \forall i \in \mathcal{N}, \\ & \sum_{k=1}^{K} \psi_i^k \le 1, && \forall i \in \mathcal{N}, \\ & 0 \le p_i \le p_{\max}, && \forall i \in \mathcal{N}, \\ & \psi_i^k \in \{0, 1\}, && \forall (k, i) \in \mathcal{K} \times \mathcal{N}. \end{aligned} \tag{5}$$

In (5), the first constraint ensures a minimum required data rate is met, the second constraint limits each link to using at most one channel, and the third and fourth constraints restrict transmission powers within the defined maximum power and enforce binary channel indicators.

The problem at hand presents significant challenges due to the complexity of the objective function and the inclusion of mixed variables in the optimization process. Furthermore, the presence of time constraints, specifically related to the channel state, necessitates the adoption of a solver with a convergence time that is shorter than the coherence time to ensure the validity of the obtained solution. As such, we seek a universal approach capable of producing efficient solutions within the necessary timeframe while adhering to the imposed constraints. In the subsequent section, we introduce our novel GNN-based model, which offers a generalizable solution and yields promising outcomes, thereby fulfilling the aforementioned objectives.

IV. SOLUTION ARCHITECTURE

In this section, we present a detailed exposition of our proposed model, including a description of the training process. Fig. 2 provides an overview of the model architecture. Essentially, the model takes the processed CSI, which has been transformed into separate interference graphs, as input and produces power levels and channel matrices as output. The model consists of two primary components: a feature extractor based on the GNNs and a CNN block that learns from the embedding vectors while simultaneously preserving the constraints inherent in the problem. The training process comprises two phases: a supervised phase that enhances the learning process and an unsupervised phase that maximizes our objective function while also mitigating overfitting.

FIGURE 2. Model architecture.

A. CHANNEL STATE INFORMATION PREPROCESSING

In this subsection, we describe the process followed to transform the CSI tensor $G$ into multiple graph structures to gain insights into their geometric properties. Our system model operates under specific constraints where each link is allocated a maximum of one RB, leading to potential interference only when multiple links share the same RB. Consequently, RBs are considered independent regarding interference, which simplifies the management of interactions between links. To effectively represent the CSI within this framework, we construct interference graphs for each RB, depicted as distinct subgraphs. This approach mirrors the uncorrelated flat fading characteristic of our system, where no correlation exists between different RBs. Since interference arises only when multiple links utilize the same bandwidth, we construct $K$ separate complete graphs ($N$ nodes and $N(N-1)/2$ edges) without any loss of information, where $K$ is the number of resource blocks. Specifically, we denote $\mathcal{G}_k(\mathcal{V}, \mathcal{E})$ as the interference graph of the k-th RB, where the nodes denote the communication links and the edges



represent the interference links. Each node is labeled by $n_i = g_{ii}^k$, $\forall i \in \mathcal{N}$, indicating the signal strength or quality at each link, and each edge is labeled by $e_{ij} = g_{ij}^k$, $\forall (i, j) \in \mathcal{N} \times \mathcal{N}$, representing the interference between links. This representation allows for a simplified yet effective understanding of the interactions and interference patterns within the network, leveraging the geometric properties of the graphs to facilitate analysis and optimization. Fig. 3 illustrates an example of three interference graphs for three communication links.

FIGURE 3. Interference graphs of three communication links using three resource blocks.

By modeling the CSI in this way, we enhance computational efficiency without sacrificing any essential information. This graph formulation enables us to parallelize the computation of our model, thereby reducing the model's training and execution time by approximately 1/K. Additionally, handling relatively small graphs is more manageable in terms of complexity and memory control.

B. GNN FEATURE EXTRACTOR

GNNs are a specialized type of neural network designed to operate on graph-structured data [50]. They share a multi-layer structure akin to DNNs, where each node within the graph combines its individual features with an aggregation of the features of its neighboring nodes. Furthermore, GNNs update the embedding of each node through iterative aggregation and combination operations. This iterative process relies on a message-passing mechanism [51], where information is exchanged among nodes in the graph through their connecting edges to capture relationships between nodes. In essence, the t-th GNN layer for a node $v \in \mathcal{V}$ can be succinctly summarized through two key iterative equations:

$$\begin{aligned} m_{vu}^{t+1} &= \phi\left(z_v^t, w_{vu}^t, z_u^t\right), \\ z_v^{t+1} &= \sigma\left(z_v^t, \rho\left(\{m_{vu}^{t+1}, u \in \mathcal{N}(v)\}\right)\right). \end{aligned} \tag{6}$$

Here, $z_v^t \in \mathbb{R}^{d_1}$ represents the hidden state of node $v \in \mathcal{V}$, and $w_{vu}^t \in \mathbb{R}^{d_2}$ denotes the feature vector associated with the edge $(v, u) \in \mathcal{E}$ at time $t$. In the (t+1)-th iteration, the edge features $w_{vu}^t$ are fused with the features of their incident nodes $\{z_u^t, z_v^t\}$ via the message function $\phi$. Subsequently, within each node $v$, messages from all neighboring nodes in $\mathcal{N}(v)$ are aggregated using the reduce function $\rho$. Finally, the node feature $z_v^{t+1}$ is updated using the update function $\sigma$. Different choices for combining and aggregation functions can lead to various types of GNNs [52], but the reduce function must always be a permutation-invariant operation, such as sum, mean, or max, to ensure that the input graph's global structure is preserved. Following the pattern of (6), our computation is defined as:

$$\begin{aligned} m_{vu}^{t+1} &= \mathrm{MLP}_1^t\left(z_u^t, g_{uv}, g_{vu}\right), \\ z_v^{t+1} &= \mathrm{MLP}_2^t\left(z_v^t, \max\{m_{vu}^{t+1}, u \in \mathcal{N}(v)\}\right). \end{aligned} \tag{7}$$

In this equation, the message function is represented as an MLP block, named $\mathrm{MLP}_1$, which consists of layers of neurons, non-linear activation functions, and batch normalization. It operates on the concatenation of node $u$'s hidden state and the features of edge $(v, u)$. The reduce function is essentially a max aggregation operation, which combines messages received from all neighbors of node $v$. The update function for node $v$ is similar to $\mathrm{MLP}_1$ but with a distinct number of neurons and is referred to as $\mathrm{MLP}_2$. It learns patterns from the aggregation of messages $m_{vu}^{t+1}$ from neighboring nodes and the node's previous hidden state. Notably, as the size of the hidden state $z_v^t$ of each node varies for each iteration $t$ due to the concatenation operations, each message-passing iteration has its specific message and update functions denoted as $\mathrm{MLP}_1^t$ and $\mathrm{MLP}_2^t$.

MLPs are preferred in our context for stability, efficiency, and performance. CNNs are less suitable due to the lack of spatial correlations in the message function input $(z_u^t, g_{uv}, g_{vu})$, and RNNs are limited by the absence of temporal patterns. Upon completing $T$ iterations, the output of the GNN feature extractor comprises the embedding vectors for each node within each graph, represented as $z_{ik}^T \in \mathbb{R}^D$; $i \in \mathcal{V}^k$, where $\mathcal{V}^k$ denotes the node set of $\mathcal{G}_k$. These vectors are then stacked to form a global embedding tensor $Z \in \mathbb{R}^{N \times K \times D}$, where $D$ denotes the embedding dimension. Subsequently, the tensor $Z$ is used for computing the power vector and the channel allocation matrix in the next steps.

C. CNN COMPONENT

In this section, we detail the CNN component of our approach, starting with the input of the embedding tensor $Z$ into a CNN block. This block executes a deconvolution operation along the embedding axis, outputting a matrix with dimensions $N \times K$. The CNN block consists of convolutional layers that extract features by progressively reducing the channel count, alongside ReLU activation functions, dropout for regularization, and batch normalization for stability. The result, denoted as $X = \mathrm{CNN}_d(Z)$, benefits from the CNN's ability to identify spatial patterns, augmenting the GNN's geometric insights.
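As an illustration of Sections IV-A and IV-B, the following minimal NumPy sketch slices a CSI tensor G into K per-RB interference graphs, applies the max-aggregation update in (7) for a few iterations, and stacks the node embeddings into Z. It is not the authors' implementation. The helper names (init_mlp, mlp, message_passing_step, make_layers, embed), the layer sizes, the random untrained weights, the omission of batch normalization, and the sharing of weights across the K graphs are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(in_dim, hidden_dim, out_dim):
    """Random, untrained weights for a two-layer perceptron (illustrative sizes)."""
    return {
        "W1": rng.normal(scale=0.1, size=(in_dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(scale=0.1, size=(hidden_dim, out_dim)),
        "b2": np.zeros(out_dim),
    }

def mlp(params, x):
    """Linear -> ReLU -> Linear along the last axis (batch normalization omitted)."""
    hidden = np.maximum(x @ params["W1"] + params["b1"], 0.0)
    return hidden @ params["W2"] + params["b2"]

def message_passing_step(z, g, mlp1, mlp2):
    """One iteration of (7) on one complete interference graph.

    z: (N, d) hidden states; g: (N, N) gains, where g[i, j] is the gain
    from transmitter j to receiver i on this RB.
    """
    n, d = z.shape
    z_u = np.broadcast_to(z[None, :, :], (n, n, d))    # entry [v, u] holds z_u
    g_uv = g.T[:, :, None]                             # entry [v, u] holds g_uv
    g_vu = g[:, :, None]                               # entry [v, u] holds g_vu
    messages = mlp(mlp1, np.concatenate([z_u, g_uv, g_vu], axis=-1))
    # A node is not its own neighbour: mask the diagonal before the max.
    messages[np.arange(n), np.arange(n), :] = -np.inf
    aggregated = messages.max(axis=1)                  # max over u in N(v)
    return mlp(mlp2, np.concatenate([z, aggregated], axis=-1))

def make_layers(num_iters=2, embed_dim=8, hidden_dim=16):
    """One (MLP1, MLP2) pair per iteration t; input widths change with t."""
    layers, d_in = [], 1
    for _ in range(num_iters):
        layers.append((init_mlp(d_in + 2, hidden_dim, embed_dim),
                       init_mlp(d_in + embed_dim, hidden_dim, embed_dim)))
        d_in = embed_dim
    return layers

def embed(G, layers):
    """Embed each of the K per-RB graphs and stack the result into Z (N, K, D)."""
    n, _, k = G.shape
    embed_dim = layers[-1][1]["b2"].shape[0]
    Z = np.empty((n, k, embed_dim))
    for rb in range(k):                    # graphs are independent across RBs
        g = G[:, :, rb]
        z = np.diag(g)[:, None].copy()     # initial node feature: direct gain g_ii
        for mlp1, mlp2 in layers:
            z = message_passing_step(z, g, mlp1, mlp2)
        Z[:, rb, :] = z
    return Z

G = rng.uniform(0.1, 1.0, size=(4, 4, 3))  # toy CSI tensor: N = 4 links, K = 3 RBs
layers = make_layers()
Z = embed(G, layers)
print(Z.shape)  # (4, 3, 8)
```

Because the max aggregation is permutation invariant, relabeling the links in this sketch only permutes the rows of Z. The loop over RBs touches each graph independently, so it could be run in parallel.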


We apply dimension manipulation techniques to process $X$ into our two desired outputs: a power vector and a resource block (RB) matrix. We treat $X$ as a matrix for channel allocation and use a softmax function across the relevant dimension to calculate channel probabilities. For power control, we condense $X$ along the $K$-axis into an $N \times 1$ vector and apply a sigmoid function to normalize the power values to a range between 0 and 1. We employ a softmax operation along the channel axis to compute the probabilities of selecting a channel. These probabilities are calculated as:

$$a_i^k = \frac{e^{x_i^k}}{\sum_{s=1}^{K} e^{x_i^s}} \tag{8}$$

Consequently, the channel allocation is calculated as follows:

$$\Psi_i^j = \begin{cases} 1 & \text{if } j = \arg\max_k (\{a_i^k, k \in \mathcal{K}\}) \\ 0 & \text{otherwise} \end{cases} \tag{9}$$

This formulation of $\Psi$ is not differentiable and would break the chain rule of backpropagation. Therefore, we use it only during deployment, where it ensures that the output adheres to the constraints. During training, we backpropagate through the probabilities instead. This initially violates the constraint that each link can have at most one channel. However, through supervised learning, the model gradually learns to maximize the probability of selecting a single channel until it approaches 1 and, consequently, the other probabilities approach 0.

Regarding the power allocation, we first apply a sigmoid function to $X$. This transforms $X$ into values within the range of $[0, 1]$. We then compute the average of these values across the channel axis. Finally, we obtain the power allocation $p_i$ by multiplying the averaged value by $p_{\max}$. The overall process is defined as follows:

$$p_i = \frac{1}{K} \sum_{k=1}^{K} \frac{p_{\max}}{1 + e^{x_i^k}} \tag{10}$$

It can be straightforwardly demonstrated that the values of $p_i$ always fall within the acceptable range, since $p_i \leq \frac{1}{K} (\sum_k 1) \cdot p_{\max} \leq 1 \cdot p_{\max}$.

Both the power vector and the channel allocation matrix originate from the same matrix $X$. Afterward, we apply different functions in the output layers to respect the constraints of each variable. Rather than employing distinct neural network blocks for each variable, we tackle the problem jointly. The majority of the model's parameters can be learned by optimizing a specific criterion. The following subsection defines the loss functions used to train our model.

D. LOSS FUNCTION DESIGN
The loss function of our model is composed of three parts: a supervised segment, an unsupervised segment, and a regulation loss designed to maintain the required minimum data rate. Initially, the neural networks in the model leverage supervised learning to acquire a generalized strategy, drawing from the diverse array of solutions within our dataset generated by PYMOO. This dataset, rich in variety due to alterations in minimum data rates, the number of links, and link distance variations, equips the model with a wide-ranging understanding of potential wireless configurations. Such a broad perspective is crucial for the model's ability to effectively adapt and fine-tune to specific scenarios during the inference phase. Following the supervised learning stage, the networks further refine their ability to optimize the objective through unsupervised learning, all the while adhering to optimization constraints enforced by the regulation loss. This holistic approach ensures that the model not only learns generalizable strategies but also enhances its objective maximization capabilities and compliance with necessary constraints.

Specifically, power control is a supervised continuous prediction problem; thus, we employ the mean square error (MSE) as the cost function of the prediction. Moreover, we consider the channel allocation as a multi-label supervised classification problem. Thus, we use the categorical cross-entropy (CCE) loss to calculate the cost of the sample's misclassification.

$$\mathcal{L}_{sup}(\hat{P}, P, \hat{\Psi}, \Psi) = \sum_{i=0}^{N} \sum_{k=0}^{K} (\hat{p}_i - p_i)^2 + \psi_i^k \log(\hat{\psi}_i^k) \tag{11}$$

We adopt the negative mean rate of the network as an unsupervised loss. The value of the loss function decreases when the data rate of each link, $\gamma_i$, increases.

$$\mathcal{L}_{rate}(\hat{\Psi}, \hat{P}) = - \sum_{i=0}^{N} \sum_{k=0}^{K} \frac{W \hat{\psi}_i^k}{N} \log_2 \left( 1 + \mathrm{SINR}_i^k(\hat{p}_i^k, \hat{\psi}_i^k) \right) \tag{12}$$

In order to ensure that every link retains the necessary minimum capacity, we incorporate a regulation loss into our model, strategically managing the rate of each link. Contrary to the conventional approach that penalizes the model when rates drop below $\gamma_{\min}$ [31], our method imposes a penalty when certain rates are excessively elevated, as expressed in equation (13). This strategy pushes the model to generate rates as near to $\gamma_{\min}$ as possible. Consequently, while rates that are excessively high are brought down, those that are too low are also increased due to the shared radio resources. Mathematically,

$$\mathcal{L}_{reg}(\hat{\Psi}, \hat{P}) = \frac{1}{N} \sum_{i=0}^{N} \max(0, \gamma_i - \gamma_{\min}) \tag{13}$$

where $\mathcal{L}_{reg}$ computes the average extent to which each rate, $\gamma_i$, surpasses the minimum, $\gamma_{\min}$, considering only the excesses, due to the max function. This regulatory loss aims to steer the model to adhere closely to the minimum rate, preventing it from significantly exceeding it and thus ensuring a consistent rate output across all links. When paired with the maximization of the network mean rates, it lends appreciable stability to our model. We note that



while supervised training is preferred, it is not mandatory. In the absence of a labeled dataset, the model can learn in an unsupervised manner.

E. TRAINING AND DEPLOYMENT PROCESS
We employ supervised learning to guide the model toward acquiring an optimal initial and generalized strategy derived from the training dataset. During the deployment phase, we alter the training direction with the objective of enhancing the rates across all links and focusing on respecting the minimum data rate by minimizing a combination of the rate and regulation loss.

In both training and deployment, we first forward propagate to compute the predictions $\hat{P}$ and $\hat{\Psi}$ and assess the loss value. Afterward, we back-propagate to calculate the gradients and update the model's weights accordingly using the Adadelta optimization technique [53]. For each epoch, we preprocess each sample's CSI and construct $K$ interference graphs. Then, we compute the node representations in parallel by evaluating (7) $T$ times for all the $K$ graphs. This parallel computation reduces the execution time to roughly $1/K$ of the sequential time. Following that, we calculate $\hat{P}$ and $\hat{\Psi}$ accordingly. Lastly, we evaluate the loss and update the learnable weights of the model.

Algorithm 1 Supervised Training Process Overview
INPUT: Dataset $\mathcal{D}$ containing tuples $(G, P, \Psi)$
OUTPUT: Model's optimal parameters $\Theta$
Initialize $\hat{P}$, $\hat{\Psi}$
for Epoch in [0, MAX EPOCHS] do:
    for all samples $(G_l, P_l, \Psi_l) \in \mathcal{D}$ $(l = 1, \dots, L)$ do:
        PreProcess $G_l$
        do in parallel: $(k = 0, \dots, K)$
            for all nodes $i$ in $\mathcal{V}^k$ $(i = 1, \dots, N)$ do:
                $z_{ki}^0 \leftarrow g_{ii}$
                for $t$ in [0, T] do: Compute $z_{ki}^t$
        Compute $\hat{\Psi}$, $\hat{P}$
        Compute $\mathcal{L}_{sup}(\hat{\Psi}, \hat{P})$
        Update weights using $\nabla_\theta \mathcal{L}_{sup}(\hat{\Psi}, \hat{P})$

Algorithm 2 Deployment Process Overview
INPUT: CSI $G$, pre-trained parameters $\Theta^*$
OUTPUT: $P^*$ and $\Psi^*$
for Epoch in [0, MAX EPOCHS] do:
    PreProcess $G$
    Compute $Z$
    Compute $\hat{\Psi}$, $\hat{P}$ (8, 10)
    Compute $\mathcal{L}_{rate}(\hat{\Psi}, \hat{P}) + \mathcal{L}_{reg}(\hat{\Psi}, \hat{P})$
    Update weights using $\nabla_\theta (\mathcal{L}_{rate} + \mathcal{L}_{reg})$
Compute $\hat{\Psi}$ (9)

The supervised training procedure is outlined in Algorithm 1. Meanwhile, Algorithm 2 describes the unsupervised deployment process, which operates with a notably reduced number of iterations compared to the training phase.

While constructing the training dataset, we have the option of utilizing any MINLP optimization technique to obtain near-optimal solutions to our optimization problem. An exhaustive search is not feasible given the number of possible solutions. Therefore, in our case, we have chosen to utilize PYMOO [54] due to its widespread applicability and demonstrated efficacy in various fields. PYMOO provides a range of flexible genetic algorithm techniques that include evaluation features capable of assessing the solutions obtained, as well as parallel computation functionalities that serve to expedite the dataset construction process.

V. PERFORMANCE EVALUATION
In this section, we evaluate the effectiveness of our proposed approach through a series of experiments and comparisons. Initially, we delve into a training convergence analysis, evaluating both supervised and unsupervised learning across various wireless network parameters. Subsequently, we show that our model outperforms the benchmarking schemes in diverse wireless network setups, focusing on network mean rate, QoS violation probability, and the level of QoS violation. Finally, we assess the robustness of our model in scenarios characterized by an imperfect channel.

A. SIMULATION PARAMETERS
We construct a rectangular 2D layout with width $w_x = 200$ m and height $w_y = 100$ m that represents a wireless environment, and we randomly distribute the $N$ transmitters in the area. We then place each receiver at a random distance between $d_{\min}$ and $d_{\max}$ from its corresponding transmitter. We adopted the channel model from the short-range outdoor model ITU-1411 with a distance-dependent path-loss [55], with 2.4 GHz carrier frequency, 1.5 m antenna height and 2.5 dBi antenna gain. The transmit power's maximum level is 4 dBm, and the background noise level is −169 dBm/Hz. We model the shadowing using the normal distribution, $\alpha_{ij}^k = 10^{S/10}$, $S \sim \mathcal{N}(0, \sigma)$, where $\sigma$ is the shadowing deviation in dB. We consider an urban outdoor environment, where $\sigma$ is between 4 dB and 12 dB. As for the fast fading channel, we use the Rayleigh fading model, $h_{ij}^k = R + jI$; $I, R \sim \frac{\mathcal{N}(0, 1)}{\sqrt{2}}$. The number of channels is $K = 10$, where each has 500 Hz of bandwidth.

It is important to note that our model is trained only once with 10000 training samples over 30 epochs where $N = 50$, $d_{\max} = 50$ m, $d_{\min} = 5$ m, $\gamma_{\min} = 200$ bps, and $\sigma = 4$ dB generated by PYMOO Single-Objective Optimization With Mixed Variables API [56]. We have also generated 1000 samples for testing the convergence of the supervised training following the same procedure. All the simulations and training have been conducted on the same hardware,


an Intel(R) Xeon(R) W-1270 CPU at 3.40 GHz with 16.0 GB of memory. The code¹ is implemented using Python 3.9 with the Deep Graph Library (DGL [57]) and PyTorch as a backend.

1 https://github.com/mahermarwani/Graph-Neural-Networks-Approach-for-Joint-Wireless-Power-Control-and-Spectrum-Allocation

Regarding network parameters, we configure the number of GNN layers to $T = 4$, and the embedding dimension is set to $D = 10$. We design MLP1 and MLP2 with three hidden layers; between these layers, we apply a ReLU activation function and batch normalization. Our CNN block comprises 3 convolution filters, with ReLU activation functions interposed. The parameters for the convolution filters, including strides, padding, and dilation, are all set to 1. The kernel size is defined as $(3 \times 3)$, ensuring that the spatial dimensions of the input tensor remain unchanged. The channel input-output pairs are configured as (10, 5, 2, 1). For the training and the deployment process, we opt for a fixed learning rate of 1.0 without any decay over epochs, as Adadelta adapts it accordingly.

B. COMPLEXITY ANALYSIS
Assuming sequential processing, the complexity of GNN encoders is $\mathcal{O}(KTN[(N-1)(D+2)D + AD^2 + BD^2])$, where $A$ and $B$ denote the numbers of hidden layers in the message and update functions, respectively. Moreover, we process the sub-graphs in parallel, thus the complexity is divided by $K$. Given the parameters we set previously, the complexity is $\mathcal{O}(240N(7N-2))$. Moreover, the CNN complexity is $\mathcal{O}([\sum_{c \in C} c_{i-1} c_i] 9NK)$, where $C = \{D, \dots, 1\}$ is the set of consecutive numbers of channels, which, in our case, is $\mathcal{O}(558NK)$. Combining the two phases, the overall complexity of a single feed-forward pass is $\mathcal{O}(240N(7N-2) + 558NK)$, which is roughly $\mathcal{O}(N^2)$.

C. BENCHMARKING SCHEMES
In our performance evaluation, we selected four distinct approaches for comparison: randomized, heuristic, convexification, and a learnable method. Since supervised training data can be scarce, we refer to our model in the following simulations as GNN when it is pre-trained in a supervised manner before deployment, and DGNN when it is not. The benchmarking schemes are explained as follows:
• RANDOM: We generate 40000 power and channel allocation solutions at random. Subsequently, we select the solution that minimizes QoS violations while maximizing the mean network rate.
• PYMOO [56]: The problem is addressed utilizing a genetic algorithm provided by PYMOO. Notably, this solution is identical to the one used to generate the training dataset.
• SLSQP [58]: Initially, greedy channel allocation [59] is assigned to all links. Following this, we resolve the power control task by employing the Sequential Least Squares Programming (SLSQP) technique.
• DNN [31]: The CSI is reshaped and fed into two DNN architectures. The first DNN architecture is responsible for power control, while the second DNN architecture handles channel allocation. However, due to the DNN architecture's inflexibility towards variations in the number of links, we adjust the number of neurons in the DNN to be suitable for the selected $N$ and retrain it with an adequate dataset.
• REGNN [40]: The problem is addressed using a Resilient GNN policy, trained using an unsupervised primal-dual approach. We average and normalize the CSI across resource blocks to construct a graph topology similar to the one used in the paper.

D. TRAINING CONVERGENCE ANALYSIS
We present in this subsection convergence analyses for the supervised and unsupervised phases in different wireless scenarios.

1) SUPERVISED CONVERGENCE
Fig. 4 and Fig. 5 illustrate the convergence patterns of the Mean Squared Error (MSE) and Categorical Cross-Entropy (CCE), respectively. Fig. 6 shows the convergence pattern of the supervised (SUP) loss, which is the sum of both MSE and CCE losses. An exponential decrease is observed in the initial iterations, followed by the stabilization of the loss curves towards a minimum, which underscores the model's proficiency in emulating the RRM strategy inherent in the training dataset. After the 5000-th iteration, a plateau in the loss values is noticeable, suggesting that the model might have reached a state where further learning is limited and is potentially trapped in a local minimum. Meanwhile, Fig. 7 shows the progression of the average network rate with respect to the number of epochs (10000 iterations per epoch), emphasizing that the model continues its learning trajectory to align the performance between testing and training samples, hence toward generalization.

FIGURE 4. Convergence of MSE loss.

By the 30-th epoch, the test performance converges to the training performance, demonstrating that the model is adept at managing unseen samples. Such robust generalization capability can be linked to the permutation-invariant character of the GNN architecture. When trained on graphs derived from the CSI, the model naturally undergoes data augmentation, enhancing the generalization performance.
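To make the supervised (SUP) loss discussed above concrete, the following minimal Python sketch computes an MSE term on the power vector and a CCE term on the channel probabilities, and sums them. It is an illustration under stated assumptions rather than the implementation used in this work: both terms are averaged over links, the cross-entropy carries the conventional negative sign so that the loss decreases as predictions improve, and the function names `mse_loss`, `cce_loss` and `sup_loss` as well as the toy values are hypothetical.

```python
import math
from typing import List


def mse_loss(p_hat: List[float], p: List[float]) -> float:
    """Mean squared error between predicted and target power levels."""
    return sum((a - b) ** 2 for a, b in zip(p_hat, p)) / len(p)


def cce_loss(psi_hat: List[List[float]], psi: List[List[int]], eps: float = 1e-12) -> float:
    """Categorical cross-entropy between predicted channel probabilities
    and one-hot target allocations, averaged over links (assumed convention)."""
    total = 0.0
    for probs, one_hot in zip(psi_hat, psi):
        # eps guards against log(0) for probabilities that are exactly zero.
        total -= sum(t * math.log(max(q, eps)) for q, t in zip(probs, one_hot))
    return total / len(psi)


def sup_loss(p_hat, p, psi_hat, psi) -> float:
    """Supervised loss: sum of the MSE and CCE terms."""
    return mse_loss(p_hat, p) + cce_loss(psi_hat, psi)


# Two links, three channels: normalized target powers and one-hot channel labels.
p = [0.2, 0.8]
psi = [[1, 0, 0], [0, 1, 0]]

good = sup_loss([0.25, 0.75], p, [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1]], psi)
poor = sup_loss([0.9, 0.1], p, [[0.2, 0.4, 0.4], [0.5, 0.2, 0.3]], psi)
print(good < poor)  # True: predictions closer to the labels give a lower loss
```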



FIGURE 5. Convergence of CCE loss.

FIGURE 6. Convergence of SUP loss.

FIGURE 7. Training vs. test model's average sum rate evolution over supervised learning process.

FIGURE 8. Model's average sum rate and QoS probability over unsupervised learning in different N values.

FIGURE 9. Model's average sum rate and QoS probability over unsupervised learning in different γmin values.

2) UNSUPERVISED CONVERGENCE
The performance of supervised learning is typically upper-bounded by that of the process used to create the training dataset, i.e., PYMOO. Thus, we demonstrate the impact of unsupervised learning in increasing the model's performance by directly maximizing the mean network rate while minimizing the QoS violations. We conduct several tests across different wireless scenarios by adjusting the minimum required rate $\gamma_{\min}$, the number of links $N$, the distribution of link locations $d_{\max}$, and the shadowing deviation $\sigma$. For every parameter change, we generate 100 samples and analyze the evolution of the average network mean rate and of the QoS violation probability over the unsupervised learning iterations.

Fig. 8 illustrates the impact of varying the number of links on the average sum rate and QoS probability. We set $N = \{50, 75, 100, 125\}$, $d_{\max} = 50$ m, $\gamma_{\min} = 300$ bps, and $\sigma = 4$ dB. The top graph indicates a convergence trend in average sum rates for all the considered $N$ values, with a rise of around 20% after 1000 iterations. Higher $N$ values produce lower rates since the radio resources are finite and invariant. The bottom graph underscores the improvement in QoS adherence over the iterations. For instance, $N = 125$ decreases its initial violation probability from 17.5% to nearly 1%, while $N = 50$ reduces it from 5% to almost 0%. This demonstrates that our approach can achieve improved performance while maintaining strict QoS compliance for various numbers of links. Fig. 9, on the other hand, evaluates


the model's performance against changing QoS values, specifically at $\gamma_{\min} = \{250, 500, 750, 1000\}$ bps, $N = 50$, $d_{\max} = 50$ m, and $\sigma = 4$ dB. We kept the same CSI in this analysis for a fair assessment. The average sum rate improves across the examined values, stabilizing near 2500 bps, a boost of approximately 31% over the initial supervised outcome. Meanwhile, the QoS violation probability decreases sharply, especially for the $\gamma_{\min} = 1000$ bps curve, and by the 800-th iteration, all curves converge to under 5%, with most nearing zero. This observation underscores the model's ability to adjust to varying QoS values while continuing to optimize rates and reduce QoS breaches.
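The unsupervised objective that drives these iterations, i.e., the sum of the rate loss (12) and the regulation loss (13), can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in this work: the per-link rates $\gamma_i$ are taken as given inputs (the SINR computation inside (12) is omitted), the two terms are weighted equally as in Algorithm 2, and the names `rate_loss`, `regulation_loss` and `unsupervised_loss` as well as the toy rates are hypothetical.

```python
from typing import List


def rate_loss(rates: List[float]) -> float:
    """Negative mean link rate: decreases as the link rates increase."""
    return -sum(rates) / len(rates)


def regulation_loss(rates: List[float], gamma_min: float) -> float:
    """Average excess of each rate over gamma_min; rates below gamma_min contribute zero."""
    return sum(max(0.0, r - gamma_min) for r in rates) / len(rates)


def unsupervised_loss(rates: List[float], gamma_min: float) -> float:
    """Deployment-phase objective: rate loss plus regulation loss."""
    return rate_loss(rates) + regulation_loss(rates, gamma_min)


# Four links with rates in bps and a minimum required rate of 300 bps.
rates = [350.0, 280.0, 400.0, 290.0]
gamma_min = 300.0

print(rate_loss(rates))                     # -330.0
print(regulation_loss(rates, gamma_min))    # 37.5
print(unsupervised_loss(rates, gamma_min))  # -292.5
```

In this toy case only the two links above 300 bps contribute to the regulation term, which is what pushes resources away from over-served links.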

FIGURE 10. Model's average sum rate and QoS probability over unsupervised learning in different $d_{\max}$ values.

Fig. 10 explores the effects of different link distances, where $d_{\max} = \{50, 60, 70, 80\}$ m, $\gamma_{\min} = 300$ bps, $N = 50$, and $\sigma = 4$ dB. Similarly, we observe a clear trend: the average sum rate increases over iterations by 30%, and higher $d_{\max}$ values produce lower rates due to higher channel attenuations. Although initially high, the QoS violation probability for all $d_{\max}$ values declines rapidly, converging to 2% by the 800-th iteration. This emphasizes the model's versatility across varying link distances. Finally, Fig. 11 showcases the influence of shadowing, where $\gamma_{\min} = 300$ bps, $N = 50$, $d_{\max} = 50$ m, and $\sigma = \{4, 6, 9, 12\}$ dB. The top graph reveals that the average sum rate increases consistently for all curves as iterations continue. Higher values of $\sigma$ lead to slightly higher rates. This observation can be attributed to the fact that if shadowing results in a positive deviation (i.e., the signal strength is higher than expected), the SNR would increase, potentially leading to a higher Shannon capacity. However, it is important to note that this does not imply shadowing is inherently "beneficial." Rather, the random nature of shadowing can occasionally produce signal strengths that surpass deterministic predictions. The bottom graph, depicting the QoS violation probability, suggests that the system is robust across diverse shadowing scenarios, with both higher and lower $\sigma$ values seeing significant violation reductions over iterations, reaching nearly 0 violations. This highlights the model's improved performance across varying shadowing deviations.

FIGURE 11. Model's average sum rate and QoS probability over unsupervised learning in different $\sigma$ values.

Overall, the supervised training provides a solid starting point, yet it does not achieve optimal results. Integrating unsupervised training during the deployment phase yields a notable improvement in the network's mean rate. Simultaneously, the QoS violation probability decreases across various wireless configurations. This underscores the effectiveness of our methodology in enhancing both performance and adaptability.

E. IMPACT OF QoS CONSTRAINTS ON PERFORMANCE
This subsection examines the model's (GNN & DGNN) performance by changing the QoS values and comparing it with established benchmark schemes. For a fair comparison, the CSI remained the same while changing $\gamma_{\min}$, as is done during the convergence analysis. The Cumulative Distribution Function (CDF) plots presented in Fig. 12 provide a comparative analysis of the link rate performances of five benchmark schemes for $\gamma_{\min} = \{300, 600, 1000\}$ bps. A noticeable rightward shift of the GNN curve in each plot highlights its capability to achieve higher link rates more often than other methods. Importantly, GNN consistently surpasses the other schemes in every scenario, including DGNN, which shows the importance of the supervised learning phase. This superiority is especially evident at extreme $\gamma_{\min}$ values, where most schemes find it challenging to uphold the QoS requirements. It is worth noting that consistently maintaining extreme $\gamma_{\min}$ can be unfeasible for various



FIGURE 12. Rate’s CDF of benchmarking schemes for different γmin values.

FIGURE 13. Performance of benchmarking schemes with respect to γmin .

channel realizations. Our model, as shown in Fig. 12(c), registers approximately 5% QoS violations, whereas other schemes hover around 50%, except for the PYMOO and REGNN approaches.

Next, we show the average sum rate of the links, the probability of QoS constraint violation, and the level of QoS violation, which is defined as the difference between the QoS requirement and the rate when the QoS constraint is violated, i.e., $\mathbb{E}[\gamma_{\min} - \gamma \mid \gamma_{\min} > \gamma]$, as a function of $\gamma_{\min}$ in Figs. 13(a-c), respectively. In Fig. 13(a), the average sum rate is observed to decrease as $\gamma_{\min}$ increases. This decline is attributable to the restrictions on all links' transmit power and channel usage, which are necessary to minimize interference and, thus, to meet the QoS constraints. Notably, the proposed scheme maintains superior performance even as its average sum rate marginally declines. Moreover, this behavior widens the performance disparity with the PYMOO scheme as $\gamma_{\min}$ increases. In contrast, other benchmark schemes consistently underperform relative to the proposed strategy. Regarding Fig. 13(b), the QoS violation probability for our model remains low. Specifically, it hovers close to 0 when $\gamma_{\min}$ is minimal and reaches approximately 5% at higher $\gamma_{\min}$ values. This performance is comparable with the PYMOO, REGNN, and DGNN approaches. In contrast, the RANDOM, SLSQP, and DNN schemes exhibit substantially elevated violation probabilities, underscoring the efficacy of the proposed scheme. Lastly, Fig. 13(c) highlights the proposed scheme's performance in terms of QoS violation levels. Although the QoS constraint may occasionally be breached, the deviation of the link's rate from $\gamma_{\min}$ is still minimal (around 10%). This ensures that even when violations occur, their impact remains largely inconsequential.

F. IMPACT OF THE NUMBER OF LINKS
In Figs. 14(a) - 14(c), we illustrate the average sum rate of the network, the probability of QoS violation, and the level of QoS violation against the number of links, $N$. Fig. 14(a) highlights that as the value of $N$ increases from 50 to 200, there is a marked decrease in the average sum rate across all benchmark schemes. This downward trend can be attributed to the intensified competition for available radio resources, resulting from the addition of links to the network. Other benchmark schemes consistently fall behind our proposed scheme by margins of 15-30%, highlighting the superiority of our proposed model. On the other hand, we observe in Fig. 14(b) and 14(c) a modest increase in both the probability of QoS violation and its level as $N$ grows. Despite this trend, our proposed scheme consistently outperforms other benchmark schemes, registering less than a 10% probability of violation and maintaining the level of violation below 50 bps at higher $N$ values. These metrics reaffirm the scheme's ability to maintain QoS requirements, even in denser networks. These tests highlight the scalability of our GNN-based model. It consistently delivers robust performance across expansive networks without the need for retraining. This characteristic emphasizes its capability to generalize to larger graphs, while trained only on smaller graphs.
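The two QoS metrics reported in Figs. 13 and 14, namely the violation probability and the violation level $\mathbb{E}[\gamma_{\min} - \gamma \mid \gamma_{\min} > \gamma]$, can be computed from a set of link rates as in the following minimal Python sketch. The function name `qos_metrics`, the convention of returning a level of 0 when no link violates the constraint, and the toy rates are assumptions made for illustration.

```python
from typing import List, Tuple


def qos_metrics(rates: List[float], gamma_min: float) -> Tuple[float, float]:
    """Return (violation probability, violation level).

    The probability is the fraction of links whose rate is below gamma_min.
    The level is the mean shortfall gamma_min - rate over violating links only,
    and is reported as 0.0 when no link violates the constraint (assumed convention).
    """
    shortfalls = [gamma_min - r for r in rates if r < gamma_min]
    probability = len(shortfalls) / len(rates)
    level = sum(shortfalls) / len(shortfalls) if shortfalls else 0.0
    return probability, level


# Four links with rates in bps and a minimum required rate of 300 bps.
probability, level = qos_metrics([350.0, 280.0, 400.0, 290.0], 300.0)
print(probability)  # 0.5
print(level)        # 15.0
```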


FIGURE 14. Performance of benchmarking schemes with respect to N.

In contrast, the DNN approach, even with retraining at each distinct $N$ value, continually lags behind, revealing its inherently limited scalability.

FIGURE 15. Execution time for different N values.

In Fig. 15, we assess the computation time required for resource allocation as a function of $N$. The simulation results focus solely on the computation time of the deployment phase, mainly because the GNN's training occurs just once and can be conducted offline prior to its actual implementation. It is essential to note that the execution time is intrinsically linked to our hardware specifications, leading to results being presented in seconds instead of milliseconds. Analyzing the data from Fig. 15, it is evident that the computation time for SLSQP and PYMOO increases almost linearly. This trend suggests that these methods may not be efficiently scalable for real-time operations. In contrast, our method's computation time is marginally higher than the DNN's, which is attributed to the unsupervised iterations involved, with the benefit of a significant increase in the overall performance. Moreover, the computation time remains unaffected by $N$. This is because the size of the GNN remains constant, irrespective of the number of links. For $N = 200$, the computation time stands at 2.3 seconds. However, this duration can be substantially reduced with superior hardware, indicating the viability of our proposed scheme for more extensive networks. Furthermore, incorporating parallel computing in graph embedding, stemming from our graph modeling, has decreased the execution time by a factor of $K$. As a result, considering a higher $K$ value renders our approach much more efficient than the other schemes.

G. IMPACT OF THE LINKS LOCATIONS
In Figs. 16(a) - 16(c), we illustrate the average sum rate of the network, the probability of QoS violation, and the level of QoS violation against the maximum link distance, $d_{\max}$. In Fig. 16(a), the GNN curve starts at approximately 2200 bps at 50 m and sees a steady decline, settling just above 2000 bps by 80 m. Notably, the GNN approach consistently surpasses the performance of the RANDOM, DNN, REGNN, and PYMOO algorithms across the entire distance range. Moving to Fig. 16(b), GNN demonstrates remarkable stability, ensuring a violation probability below 5% throughout all distances. This stability sets it apart from other methods, particularly DNN and SLSQP, the latter of which hovers near 50%. Lastly, Fig. 16(c) underscores GNN's efficiency, with its curve starting at around 50 bps at 50 m and registering a slight increase to roughly 70 bps by 80 m. In this context, the GNN outperforms most other schemes, with the exception of PYMOO, which is precisely engineered to be resilient against optimization constraints at the expense of an extensive execution time.

FIGURE 16. Performance of benchmarking schemes with respect to $d_{\max}$.

H. IMPACT OF THE FADING EFFECTS
In Figs. 17(a) - 17(c), we show the average sum rate of the network, the probability of QoS violation, and the level of QoS violation against the shadowing deviation, $\sigma$. Among the tested schemes, the GNN consistently outperforms its counterparts, achieving the highest average sum rate across all shadowing deviations, as seen in Fig. 17(a). Furthermore, when it comes to ensuring the quality of service, the GNN demonstrates resilience, exhibiting the lowest probability and level of QoS violations, as shown in Fig. 17(b-c). This consistent superiority of the GNN emphasizes its potential as a highly reliable solution in environments with varying shadowing deviations.

FIGURE 17. Performance of benchmarking schemes with respect to $\sigma$.

I. IMPACT OF NOISY CSI
In this subsection, we evaluate the resilience of our model when subjected to channel imperfections. These imperfections are quantified using a specific expression that defines the discrepancy between the estimated and actual multipath fading effects. The relationship between the estimated multipath fading values, $h_{ij}^k$, and their true counterparts, $\tilde{h}_{ij}^k$, is described by the first-order Gauss-Markov process [60], expressed as follows:

$$\tilde{h}_{ij}^k = \sqrt{1 - \sigma_e^2}\, h_{ij}^k + \sigma_e n_{ij}^k \tag{14}$$

Here, $n_{ij}^k$ represents the error associated with the estimated channel, $h_{ij}^k$, which adheres to a complex Gaussian distribution. The error coefficient $\sigma_e$ characterizes the precision of the CSI, where $\sigma_e$ ranges from 0 to 1. A smaller value of $\sigma_e$ indicates a higher CSI accuracy, approaching perfect accuracy as $\sigma_e$ tends to zero. Consequently, we can express the noisy channel with the following equation:

$$\tilde{g}_{ij}^k = \beta_{ij}^k \alpha_{ij}^k |\tilde{h}_{ij}^k|^2 \tag{15}$$

To test the robustness, we first generate a clean CSI. We then apply Equation (15) to introduce varying degrees of noise to the clean CSI by adjusting $\sigma_e$ within its defined range. Using the noisy CSI, we determine the power and channel allocation solutions for all benchmark schemes and for our approach. Upon finalizing these solutions, we calculate the actual link rates using the clean CSI. This method allows us to measure the effectiveness of our model in real-world conditions, where channels are rarely perfect.

FIGURE 18. Performance of benchmarking schemes with respect to $\sigma_e$.

Fig. 18(a) to Fig. 18(c) demonstrate that the GNN scheme excels in performance against a rising distortion coefficient, $\sigma_e$. In graph (a), all benchmark schemes experience a reduced


average sum rate as σe increases due to the distortion, yet [9] F. Jameel, Z. Hamid, F. Jabeen, S. Zeadally, and M. A. Javed, ‘‘A survey of
the GNN maintains the highest rates, indicating a strong device-to-device communications: Research issues and challenges,’’ IEEE
Commun. Surveys Tuts., vol. 20, no. 3, pp. 2133–2168, 3rd Quart., 2018.
resistance to distortion. Graph (b) reveals that the GNN’s [10] S. Gupta, R. Patel, R. Gupta, S. Tanwar, and N. Patel, ‘‘A survey on
probability of QoS violations stays under 10%, contrasting resource allocation schemes in device-to-device communication,’’ in Proc.
with the marked vulnerability of other schemes under the 12th Int. Conf. Cloud Comput., Data Sci. Eng. (Confluence), Jan. 2022,
pp. 140–145.
same conditions. Graph (c) shows the GNN’s QoS violation
[11] Z. Lin and Y. Liu, ‘‘Joint uplink and downlink transmissions in user-
level remains below 10 bps, surpassing by far other schemes, centric OFDMA cloud-RAN,’’ IEEE Trans. Veh. Technol., vol. 68, no. 8,
which worsen with higher σe . GNN’s consistent robustness, pp. 7776–7788, Aug. 2019.
attributed to its permutation invariant features, showcases [12] M. K. Shehata, S. M. Gasser, H. M. El-Badawy, and M. E. Khedr,
‘‘Optimized dual uplink and downlink resource allocation for multiple
its superior design in mitigating distortion and preserving class of service in OFDM network,’’ in Proc. IEEE Int. Symp. Signal
service quality. Process. Inf. Technol. (ISSPIT), Dec. 2015, pp. 597–601.
VI. CONCLUSION
We presented a novel GNN-based framework for jointly solving power control and spectrum allocation in a non-orthogonal wireless environment. Our approach demonstrated superior performance in terms of average sum rate and QoS preservation in different wireless setups compared to other heuristic and learnable approaches, and achieved robustness over an imperfect channel. Additionally, our approach exhibited scalability, stability, and generalization, making it suitable for various network structures with different setups, such as D2D networks and downlink-uplink cellular scenarios. This study establishes a foundation for advanced RRM in future wireless networks. Future research should delve into GNNs’ capabilities for dynamic spectrum allocation, interference management, and network optimization in changing conditions. In addition, in-depth theoretical analysis is needed to pinpoint the best graph representations of wireless networks and fine-tune the GNN embedding layer. Lastly, incorporating temporal dynamics into GNN training could further improve RRM outcomes.
IEEE Trans. Wireless Commun., vol. 19, no. 1, pp. 665–679, Jan. 2020.
[22] F. Liang, C. Shen, W. Yu, and F. Wu, ‘‘Towards optimal power control via
REFERENCES ensembling deep neural networks,’’ IEEE Trans. Commun., vol. 68, no. 3,
[1] Y. Zhao, J. Zhao, W. Zhai, S. Sun, D. Niyato, and K. Lam, ‘‘A survey pp. 1760–1776, Mar. 2020.
of 6G wireless communications: Emerging technologies,’’ in Advances [23] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos,
in Information and Communication (Advances in Intelligent Systems and ‘‘Learning to optimize: Training deep neural networks for interfer-
Computing), vol. 1363, K. Arai, Ed. Cham, Switzerland: Springer, 2021. ence management,’’ IEEE Trans. Signal Process., vol. 66, no. 20,
[2] M. Alsabah et al., ‘‘6G wireless communications networks: A comprehen- pp. 5438–5453, Oct. 2018.
sive survey,’’ IEEE Access, vol. 9, pp. 148191–148243, 2021. [24] W. Lee, M. Kim, and D. Cho, ‘‘Deep power control: Transmit power
[3] C. D. Alwis et al., ‘‘Survey on 6G frontiers: Trends, applications, control scheme based on convolutional neural network,’’ IEEE Commun.
requirements, technologies and future research,’’ IEEE Open J. Commun. Lett., vol. 22, no. 6, pp. 1276–1279, Jun. 2018.
Soc., vol. 2, pp. 836–886, 2021. [25] Y. Shen, Y. Shi, J. Zhang, and K. B. Letaief, ‘‘Graph neural networks for
[4] M. Moussaoui, E. Bertin, and N. Crespi, ‘‘5G shortcomings and beyond- scalable radio resource management: Architecture design and theoretical
5G/6G requirements,’’ in Proc. 1st Int. Conf. 6G Netw. (6GNet), Jul. 2022, analysis,’’ IEEE J. Sel. Areas Commun., vol. 39, no. 1, pp. 101–115,
pp. 1–8. Jan. 2021.
[5] T. Akhtar, C. Tselios, and I. Politis, ‘‘Radio resource management: [26] S. He et al., ‘‘An overview on the application of graph neural networks in
Approaches and implementations from 4G to 5G and beyond,’’ Wireless wireless networks,’’ IEEE Open J. Commun. Soc., vol. 2, pp. 2547–2565,
Netw., vol. 27, no. 1, pp. 693–734, Jan. 2021, doi: 10.1007/s11276-020- 2021.
02479-w. [27] L. Ruiz, F. Gama, and A. Ribeiro, ‘‘Graph neural networks: Architectures,
[6] F. Qamar, M. U. A. Siddiqui, M. N. Hindia, R. Hassan, and Q. N. Nguyen, stability, and transferability,’’ Proc. IEEE, vol. 109, no. 5, pp. 660–682,
‘‘Issues, challenges, and research trends in spectrum management: A May 2021.
comprehensive overview and new vision for designing 6G networks,’’ [28] M. Lee, G. Yu, and G. Y. Li, ‘‘Learning to branch: Accelerating resource
Electronics, vol. 9, no. 9, p. 1416, Sep. 2020. allocation in wireless networks,’’ IEEE Trans. Veh. Technol., vol. 69, no. 1,
[7] J. Huang, S. Huang, C.-C. Xing, and Y. Qian, ‘‘Game-theoretic power pp. 958–970, Jan. 2020.
control mechanisms for device-to-device communications underlaying [29] Z. Zhang and M. Tao, ‘‘Learning-based branch-and-bound for non-
cellular system,’’ IEEE Trans. Veh. Technol., vol. 67, no. 6, pp. 4890–4900, convex complex modulus constrained problems with applications in
Jun. 2018. wireless communications,’’ IEEE Trans. Wireless Commun., vol. 21, no. 6,
[8] A. Ramezani-Kebrya, M. Dong, B. Liang, G. Boudreau, and pp. 3752–3763, Jun. 2022.
S. H. Seyedmehdi, ‘‘Joint power optimization for device-to-device [30] T. Lin and Y. Zhu, ‘‘Beamforming design for large-scale antenna arrays
communication in cellular networks with interference control,’’ IEEE using deep learning,’’ IEEE Wireless Commun. Lett., vol. 9, no. 1,
Trans. Wireless Commun., vol. 16, no. 8, pp. 5131–5146, Aug. 2017. pp. 103–107, Jan. 2020.

VOLUME 2, 2024 731


[31] W. Lee and R. Schober, ‘‘Deep learning-based resource allocation [52] Z. Zhang, P. Cui, and W. Zhu, ‘‘Deep learning on graphs: A survey,’’ IEEE
for device-to-device communication,’’ IEEE Trans. Wireless Commun., Trans. Knowl. Data Eng., vol. 34, no. 1, pp. 249–270, Jan. 2022.
vol. 21, no. 7, pp. 5235–5250, Jul. 2022. [53] M. D. Zeiler, ‘‘ADADELTA: An adaptive learning rate method,’’ 2012,
[32] K. Nakashima, S. Kamiya, K. Ohtsu, K. Yamamoto, T. Nishio, and arXiv:1212.5701.
M. Morikura, ‘‘Deep reinforcement learning-based channel allocation for [54] W. Cui, K. Shen, and W. Yu, ‘‘Spatial deep learning for wireless
wireless lans with graph convolutional networks,’’ IEEE Access, vol. 8, scheduling,’’ in Proc. IEEE Global Commun. Conf. (GLOBECOM),
pp. 31823–31834, 2020. Dec. 2018, pp. 1–6.
[33] M. Eisen and A. Ribeiro, ‘‘Optimal wireless resource allocation with [55] (2009). Propagation Data and Prediction Methods for the Planning of
random edge graph neural networks,’’ IEEE Trans. Signal Process., vol. 68, Short-range Outdoor Radiocommunication Systems and Radio Local Area
pp. 2977–2991, 2020. Networks in the Frequency Range 300 Mhz to 100 Ghz. [Online]. Available:
[34] J. Guo and C. Yang, ‘‘Learning power control for cellular systems with https://api.semanticscholar.org/CorpusID
heterogeneous graph neural network,’’ in Proc. IEEE Wireless Commun. [56] J. Blank and K. Deb, ‘‘PYMOO: Multi-objective optimization in Python,’’
Netw. Conf. (WCNC), Mar. 2021, pp. 1–6. IEEE Access, vol. 8, pp. 89497–89509, 2020.
[35] X. Zhang, H. Zhao, J. Xiong, X. Liu, L. Zhou, and J. Wei, ‘‘Scalable power [57] M. Wang et al., ‘‘Deep graph library: A graph-centric, highly-performant
control/beamforming in heterogeneous wireless networks with graph package for graph neural networks,’’ 2019, arXiv:1909.01315.
neural networks,’’ in Proc. IEEE Global Commun. Conf. (GLOBECOM), [58] J. Nocedal and S. J. Wright, Numerical Optimization. Cham, Switzerland:
Dec. 2021, pp. 1–6. Springer, 1999.
[36] V. Ranasinghe, N. Rajatheva, and M. Latva-aho, ‘‘Graph neural network [59] Y. Sun, D. Xu, D. W. K. Ng, L. Dai, and R. Schober, ‘‘Optimal 3D-
based access point selection for cell-free massive MIMO systems,’’ in Proc. trajectory design and resource allocation for solar-powered UAV commu-
IEEE Global Commun. Conf. (GLOBECOM), Dec. 2021, pp. 1–6. nication systems,’’ IEEE Trans. Commun., vol. 67, no. 6, pp. 4281–4298,
[37] X. Zhang, Z. Zhang, and L. Yang, ‘‘Joint user association and power Jun. 2019.
allocation in heterogeneous ultra dense network via semi-supervised [60] C. He, C. Tian, C. Zhang, D. Feng, C. Pan, and F.-C. Zheng,
representation learning,’’ 2021, arXiv:2103.15367. ‘‘Energy efficiency optimization for distributed antenna systems with D2D
[38] A. Chowdhury, G. Verma, C. Rao, A. Swami, and S. Segarra, ‘‘Efficient communications under channel uncertainty,’’ IEEE Trans. Green Commun.
power allocation using graph neural networks and deep algorithm Netw., vol. 4, no. 4, pp. 1037–1047, Dec. 2020.
unfolding,’’ in Proc. IEEE Int. Conf. Acoust., Speech Signal Process.
MAHER MARWANI (Student Member, IEEE) received the B.S. degree in engineering from École Polytechnique de Tunisie (EPT), Tunisia, in 2017. He is currently pursuing the Ph.D. degree in electrical engineering with École de Technologie Supérieure. His research interests include intelligent solutions for radio resource management (RRM) and leveraging advancements in graph neural networks (GNNs).

GEORGES KADDOUM (Senior Member, IEEE) received the bachelor’s degree in electrical engineering from École Nationale Supérieure de Techniques Avancées (ENSTA Bretagne), Brest, France, the M.S. degree in telecommunications and signal processing (circuits, systems, and signal processing) from Université de Bretagne Occidentale and Telecom Bretagne (ENSTB), Brest, in 2005, and the Ph.D. degree (Hons.) in signal processing and telecommunications from the National Institute of Applied Sciences (INSA), University of Toulouse, Toulouse, France, in 2009. He is currently a Professor and the Research Director of the Resilient Machine Learning Institute (ReMI), and the Tier 2 Canada Research Chair of École de Technologie Supérieure (ÉTS), Université du Québec, Montréal, Canada. He has published more than 300 journal articles, conference papers, and two chapters in books, and has eight pending patents. His research interests include wireless communication networks, tactical communications, resource allocations, and network security. He received the Best Papers Award from the 2014 IEEE International Conference on Wireless and Mobile Computing, Networking, Communications (WIMOB), the 2017 IEEE International Symposium on Personal Indoor and Mobile Radio Communications (PIMRC), and the 2023 IEEE International Wireless Communications and Mobile Computing Conference (IWCMC). He received the IEEE Transactions on Communications Exemplary Reviewer Award in 2015, 2017, and 2019. He received the Research Excellence Award from Université du Québec in 2018. In 2019, he received the Research Excellence Award from ÉTS in recognition of his outstanding research outcomes. He also won the 2022 IEEE Technical Committee on Scalable Computing (TCSC) Award for Excellence (Middle Career Researcher). He has received the prestigious 2023 MITACS Award for Exceptional Leadership. He served as an Associate Editor for IEEE Transactions on Information Forensics and Security and IEEE Communications Letters. He is also serving as an Area Editor for IEEE Transactions on Machine Learning in Communications and Networking and an Editor for IEEE Transactions on Communications.
