Graph Neural Networks Approach For Joint Wireless Power Control and Spectrum Allocation
ABSTRACT The proliferation of wireless technologies and the escalating performance requirements
of wireless applications have led to diverse and dynamic wireless environments, presenting formidable
challenges to existing radio resource management (RRM) frameworks. To address these challenges, researchers
have proposed deep learning (DL) models that learn patterns from wireless data and leverage the
extracted information to resolve multiple RRM tasks, such as channel allocation and power control. However,
the majority of existing DL architectures are designed to operate on Euclidean data,
thereby disregarding a substantial amount of information about the topological structure of wireless networks.
As a result, the performance of DL models may be suboptimal when applied to wireless environments
due to the failure to capture the network’s non-Euclidean geometry. This study presents a novel approach
to address the challenge of power control and spectrum allocation in an N-link interference environment
with shared channels, utilizing a graph neural network (GNN) based framework. In this type of wireless
environment, the available bandwidth can be divided into blocks, offering greater flexibility in allocating
bandwidth to communication links, but also requiring effective management of interference. One potential
solution to mitigate the impact of interference is to control the transmission power of each link while ensuring
the network’s data rate performance. Therefore, the power control and spectrum allocation problems are
inherently coupled and should be solved jointly. The proposed GNN-based framework presents a promising
avenue for tackling this complex challenge. Our experimental results demonstrate that our proposed approach
yields significant improvements compared to other existing methods in terms of convergence, generalization,
performance, and robustness, particularly in the context of an imperfect channel.
INDEX TERMS Intelligent resource allocation, RRM, 6G, GNN, D2D, AI.
2024 The Authors. This work is licensed under a Creative Commons Attribution 4.0 License.
For more information, see https://creativecommons.org/licenses/by/4.0/
VOLUME 2, 2024 717
architecture characterized by multiple communication links sharing the same available bandwidth. In this setting, the co-existence of multiple links causes significant interference and performance degradation, which necessitates effective management of transmission power control and spectrum allocation. This network structure can be observed in various wireless scenarios, including device-to-device (D2D) communication [7], [8], [9], [10], where multiple devices communicate directly without a network infrastructure, and uplink/downlink [11], [12], [13], [14], [15] scenarios, where a base station communicates with multiple users using the same spectrum, also known as non-orthogonal multiple access (NOMA). To maximize the network’s sum rate, solving the mixed-integer, non-convex optimization problem involving power control and channel allocation is essential. However, obtaining a globally optimal solution within the required time is challenging. Therefore, researchers have proposed several near-optimal solutions for specific cases [16], [17], [18], [19], which tend to have high computational complexity and are impractical for real-time scenarios.

In recent years, researchers have explored the use of machine learning (ML) techniques to address wireless network optimization problems. Specifically, there has been interest in incorporating deep learning (DL) approaches, which have shown promise in a variety of applications. Two primary approaches have been pursued in this integration: (1) constructing end-to-end learnable architectures that can capture complex relationships between inputs and outputs [20], [21], [22], and (2) replacing computational blocks within existing solutions with DL architectures to reduce computational costs [23], [24]. Despite promising results, existing DL-based approaches have primarily focused on addressing isolated RRM tasks such as power control, user association, and link scheduling. Moreover, their scalability to large wireless networks is a concern as they scale linearly with respect to the size of the input data. Furthermore, techniques such as multi-layer perceptrons (MLPs) and convolutional neural networks (CNNs) are prone to overfitting and, thus, require large amounts of training data. Additionally, these methods rely on tabular data, such as channel state information (CSI), which ignores the network’s underlying topology. Therefore, there is a need for further research to explore more effective ways to integrate DL approaches into wireless network optimization problems. Recent research has demonstrated the potential for improving the scalability and generalization of DL-based RRM solutions by integrating the target task’s structure into the neural network architecture [25]. Given that wireless networks can be intuitively modeled as graph topologies, there is a growing interest in leveraging graph representation learning techniques to enhance the performance of RRM algorithms [26]. One such approach is Graph Neural Networks (GNNs), which possess several attractive properties, including permutation equivariance, scalability, generalization, high computational efficiency, and the ability to train efficiently on relatively small datasets [27]. The application of GNNs has yielded promising results in various domains, indicating its potential as an effective technique for enhancing the performance of RRM algorithms in wireless networks.

The primary aim of this research is to propose a solution that simultaneously addresses spectrum allocation and power control tasks. Initially, we formulate a network mean rate maximization problem, considering both RRM tasks and the minimum Quality of Service (QoS) required for each communication link. Subsequently, we create interference graphs from the network’s CSI, enabling parallel processing without information loss. Additionally, we develop an end-to-end GNN-based framework that learns from these constructed graphs and embeds them into Euclidean space. This embedding is used to compute power and channel allocation solutions. In contrast to Deep Neural Network (DNN) models, our framework is both scalable and generalizable, requiring no retraining or architectural modification when changing the input size. It also excels in computational efficiency due to parallel execution. To enhance the model’s generalization and training stability, we combine four loss functions: the supervised mean squared error for power control, the supervised cross-entropy for channel allocation, an unsupervised loss to avoid constraining the model with an upper-bound performance from supervised training, and a regulation loss to ensure QoS constraints are met. Lastly, we rigorously tested our approach, focusing on the training convergence, generalization across different wireless setups, network mean rate, QoS violation, scalability with input size, and robustness in the presence of imperfect channel estimation.

This paper is structured as follows. In Section II, a comprehensive literature review is presented to explore the previous work related to our research. In Section III, we introduce the N-link interference environment with shared channels, which is the problem setting that our proposed solution is designed to address. In Section IV, we present the optimization problem that we aim to solve. In Section V, we provide a detailed description of our proposed end-to-end solution architecture, which consists of various components, including CSI preprocessing, the GNN feature extractor, the MLP component, the loss function design, and the training process. In Section VI, we conduct extensive simulations to evaluate our proposed solution’s performance in terms of stability, generalization, and robustness compared to the state-of-the-art methods. Finally, in Section VII, we present our conclusions and future research directions.

II. LITERATURE REVIEW
Numerous studies have focused on solving the power control and spectrum allocation problems in different network topologies, either independently or concurrently. For example, in [16], the authors proposed a dual-based iterative algorithm that allocates resources to D2D pairs while maintaining the quality of service requirements. Another study [19] developed a two-stage algorithm to maximize the energy efficiency of D2D communication under cellular
constraints, assuming that each D2D link could use one sub-channel at most to decrease the computational complexity. In contrast, [17] proposed a channel and power allocation scheme with channel reuse based on the Hungarian algorithm and a prioritizing method. Moreover, [18] employed a game-theory approach to manage the reuse of multiple channels by multiple D2D pairs. Despite the various proposed solutions, most of them were heuristic approaches or tended to convexify the RRM problems, resulting in a high computational complexity. Additionally, they did not provide complete flexibility in allocating multiple channels to multiple links, mainly because of convergence issues.

Given the limitations of model-based and heuristic solutions, researchers have turned to learnable approaches by integrating DL architectures to tackle RRM problems. For example, the works in [21], [28], [29] integrated a DL component to learn the optimal pruning policy for the branch-and-bound (B&B) algorithm to solve mixed-integer nonlinear programming (MINLP) problems. While this approach simplifies the problem significantly and reduces the exponential computation of the traditional B&B algorithm, intense sampling is required to train the DL architecture since the training is supervised. To alleviate the need for training data, unsupervised learning approaches have been explored [30]. This work considers constructing a DNN framework to solve beamforming problems over an imperfect channel, which is trained in an unsupervised fashion using the negative of the sum rate. Similarly, [22] constructed a DNN that takes CSI and computes the power of each user. However, this work did not integrate the minimum rate constraints into the training process, which raises questions about the solution’s feasibility. Another approach to training DL models is combining supervised and unsupervised losses. For instance, [31] proposed an end-to-end DL framework to solve resource allocation in multi-channel cellular systems with D2D links. Moreover, the approach can be implemented in a centralized manner, with full knowledge of the CSI, or in a distributed manner with partial CSI. However, the authors transform the continuous power variable into a set of discrete levels in order to use the cross-entropy loss. Following the same training approach, a CNN model is employed in [24] to learn the patterns from CSI and output the power control that maximizes the energy or spectrum efficiency of the network.

Despite the promising results achieved by the current DNN and CNN approaches, their lack of flexibility with input sizes is a significant limitation. Any alteration in input shape necessitates architectural modification. Furthermore, they prove inadequate in large wireless scenarios with a substantial number of connected devices. This deficiency stems from their heavy reliance on the quality of training data, which can be challenging to obtain in real-life situations. Moreover, the training process for these models is often time-consuming and typically conducted offline. Another drawback of DNN and CNN approaches is their disregard for the geometric information inherent in the input data.

To facilitate the incorporation of input data structures into DL models for RRM tasks, several solutions based on GNNs have been introduced [26]. These approaches have exhibited promise in tackling various RRM tasks, encompassing channel allocation, power control, and user association. For instance, in [32], a framework combining Deep Reinforcement Learning (DRL) and Graph Convolution Networks (GCNs) was proposed for channel allocation. This method enabled the agent to learn optimal channel assignments to access points using features extracted from the wireless environment as a state space. However, the model’s testing was confined to a wireless setting with perfect channels and a relatively small number of devices. Similarly, [33] introduced a GNN-based framework for learning resource allocation strategies in wireless networks, offering reduced training times and improved scalability compared to conventional MLPs. Nonetheless, this framework was not well-suited for heterogeneous wireless devices or systems with single or multiple antennas. To address these limitations, [25] presented a more flexible GNN-based solution for constrained power allocation in a heterogeneous MIMO-interfering environment. Leveraging the permutation-invariant properties of RRM problems, this GNN architecture demonstrated excellent generalization across different problem scales with minimal training data. In addition, in [34], researchers aimed to find the optimal power control strategy in an uplink multi-cell network by combining DNNs with knowledge from the wireless network’s topology, reducing training complexity and model parameters. However, this approach was tailored exclusively to the power control task. In contrast, [35] proposed employing GNNs to tackle power control and beamforming issues in heterogeneous D2D networks. Here, communication and interference links were represented as vertices in the wireless graph, and an unsupervised learning process was employed for the graph convolutional model. This method demonstrated favorable properties such as scalability and reduced execution time compared to alternative approaches. Similarly, [36] introduced an Access Point (AP) selection strategy for massive cell-free Multiple Input, Multiple Output (MIMO) systems based on GNNs. The authors constructed two graphs: a homogeneous one representing only AP nodes and a heterogeneous one containing both user equipments and AP nodes. However, these methods modeled the wireless network as a single graph, assuming that all communication links interfered with each other. GNNs can serve as end-to-end learnable solutions or feature extractors. For example, [37] proposed a joint optimization framework for user association and power control in a heterogeneous ultra-dense network. Similarly, [38] improved the Iteratively Weighted Minimum Mean Square Error (WMMSE) algorithm [39] by incorporating trainable components parametrized by GNNs. Simulations illustrated that the proposed method, unfolded WMMSE, delivered a comparable performance to WMMSE but with significantly lower time complexity.
The work in [40] introduced a trainable resilient RRM policy using an unsupervised primal-dual approach for power control and user association. Another paper [41] presented an edge-update empowered GNN architecture, enhancing GNNs’ ability to handle node and edge variables and validating its Permutation Equivariance in power allocation scenarios. Additionally, [42] introduced Aggregation GNNs for decentralized resource allocation in wireless networks, utilizing a model-free primal-dual approach for asynchronous local information processing. The study in [43] proposed a distributed spectrum allocation scheme for vehicle-to-everything (V2X) networks using GNNs and multi-agent RL to optimize the network capacity. Furthermore, [44] discussed GNN-based frameworks for distributed power allocation in wireless networks, aimed at minimizing signaling overhead by incorporating Recurrent Neural Networks (RNNs) to capture temporal dynamics. The work in [45] offered a GNN framework to enhance power control and hybrid precoding in wireless systems, demonstrating scalability and efficiency. The work in [46] proposed a state-augmented algorithm for RRM in multi-user networks, ensuring feasible and nearly optimal decisions. In addition, [47] introduced a Heterogeneous GNN model for resource allocation in heterogeneous networks. The study in [48] presented a GNN-based scheme for RRM in wireless IoT networks, optimizing resources in D2D communications. Lastly, [49] explored the expressive power of GNNs in learning wireless policies, highlighting the limitations of Vertex-GNNs and the advantages of Edge-GNNs in resource allocation tasks.

Nevertheless, most of the mentioned GNN-based works are not adaptable to environments where multiple resource blocks have varying CSI. Additionally, many of these methods focus on addressing individual RRM tasks.

FIGURE 1. N-link interference environment with shared channels.

independent resource allocation for each frame. We introduce $g_{ii}^{k} \in \mathbb{R}$ to represent the direct channel gain between the transmitter and receiver of the k-th RB in the i-th link, and $g_{ij}^{k} \in \mathbb{R}$ to denote the channel gain between the transmitter of the j-th link and the receiver of the i-th link. To represent all channel gains, we define $G = [G_1, \ldots, G_K] \in \mathbb{R}^{N \times N \times K}$ as the CSI tensor of all links in all resource blocks, where:

\[
G_k =
\begin{bmatrix}
g_{11}^{k} & g_{12}^{k} & \cdots & g_{1N}^{k} \\
g_{21}^{k} & g_{22}^{k} & \cdots & g_{2N}^{k} \\
\vdots & \vdots & \ddots & \vdots \\
g_{N1}^{k} & g_{N2}^{k} & \cdots & g_{NN}^{k}
\end{bmatrix}
\tag{1}
\]

Taking into consideration the most common types of fading, the channel gain formula can be expressed as:
\[
z_v^{t+1} = \mathrm{MLP}_2^{t}\left(z_v^{t},\ \max\left\{ m_{vu}^{t+1},\ u \in \mathcal{N}(v) \right\}\right)
\tag{7}
\]
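As a rough illustration of the update in (7), the sketch below applies an element-wise max over the messages arriving at a node and passes the concatenation of the node's current embedding and the aggregated message through an update function. Plain Python lists stand in for tensors, and `toy_mlp` is a hypothetical placeholder for the update network in (7) (a single ReLU unit with unit weights), not the trained model.

```python
def max_aggregate(messages):
    """Element-wise max over the messages m_vu received from all neighbours u."""
    return [max(component) for component in zip(*messages)]


def update_node(z_v, messages, mlp):
    """One node update: z_v^{t+1} = mlp([z_v^t, max_u m_vu^{t+1}])."""
    aggregated = max_aggregate(messages)
    # List concatenation plays the role of feeding (z_v, aggregated) to the MLP.
    return mlp(z_v + aggregated)


def toy_mlp(features):
    """Hypothetical stand-in for the update network: one ReLU unit, unit weights."""
    return [max(0.0, sum(features))]


# Node v with a 2-dimensional embedding and two neighbours.
z_v = [0.5, -1.0]
messages = [[1.0, 5.0], [3.0, 2.0]]
z_next = update_node(z_v, messages, toy_mlp)
```

Because the max is taken over an unordered set of neighbours, reordering `messages` leaves `z_next` unchanged, which is the permutation-invariance property that the aggregation step relies on.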
We apply dimension manipulation techniques to process $X$ into our desired outputs: a power vector and a Resource Block (RB) matrix. For channel allocation, we treat $X$ as a matrix and apply a softmax function along the channel axis to compute the probability of selecting each channel. For power control, we condense $X$ along the $K$-axis into an $N \times 1$ vector and apply a sigmoid function to normalize the power values to a range between 0 and 1. The channel probabilities are calculated as:

\[
a_i^{k} = \frac{e^{x_{ik}}}{\sum_{s=1}^{K} e^{x_{is}}}
\tag{8}
\]

Consequently, the channel allocation is calculated as follows:

\[
\Psi_i^{k} =
\begin{cases}
1 & \text{if } k = \arg\max_{s \in \mathcal{K}} a_i^{s} \\
0 & \text{otherwise}
\end{cases}
\tag{9}
\]

The current formulation of $\Psi$ is not differentiable and would break the chain rule of backpropagation. Therefore, during deployment, we utilize this formulation to ensure that the output adheres to the constraints. However, during training, we backpropagate with the probabilities. This initially violates the constraint that each link can have at most one channel. However, through supervised learning, the model gradually learns to maximize the probability of selecting a single channel until it approaches 1 and, consequently, the other probabilities approach 0.

Regarding the power allocation, we first apply a sigmoid function to $X$. This transforms $X$ into values within the range of [0, 1]. We then compute the average of these values across the channel axis. Finally, we obtain the power allocation $p_i$ by multiplying the averaged value by $p_{\max}$. The overall process is defined as follows:

\[
p_i = \frac{1}{K} \sum_{k=1}^{K} \frac{p_{\max}}{1 + e^{x_{ik}}}
\tag{10}
\]

It can be straightforwardly demonstrated that the values of $p_i$ always fall within the acceptable range, since $p_i \le \frac{1}{K}\left(\sum_{k} 1\right) \cdot p_{\max} \le 1 \cdot p_{\max}$.

Both the power vector and the channel allocation matrix originate from the same matrix $X$. Afterward, we apply different functions in the output layers to respect the constraints of each variable. Rather than employing distinct neural network blocks for each variable, we tackle the problem jointly. The majority of the model’s parameters can be learned by optimizing a specific criterion. The following subsection will define the loss functions utilized to train our model.

D. LOSS FUNCTION DESIGN
The loss function of our model is composed of three parts: a supervised segment, an unsupervised segment, and a regulation loss designed to maintain the required minimum data rate. Initially, the neural networks in the model leverage supervised learning to acquire a generalized strategy, drawing from the diverse array of solutions within our dataset generated by PYMOO. This dataset, rich in variety due to alterations in minimum data rates, the number of links, and link distance variations, equips the model with a wide-ranging understanding of potential wireless configurations. Such a broad perspective is crucial for the model’s ability to effectively adapt and fine-tune to specific scenarios during the inference phase. Following the supervised learning stage, the networks further refine their ability to optimize the objective through unsupervised learning, all the while adhering to optimization constraints enforced by the regulation loss. This holistic approach ensures that the model not only learns generalizable strategies but also enhances its objective maximization capabilities and compliance with necessary constraints.

Specifically, power control is a supervised continuous prediction problem; thus, we employ the mean square error (MSE) to determine the prediction’s cost function. Moreover, we consider the channel allocation as a multi-label supervised classification problem. Thus, we use the categorical cross-entropy (CCE) loss to calculate the cost of the sample’s misclassification.

\[
L_{sup}(\hat{P}, P, \hat{\Psi}, \Psi) = \sum_{i=0}^{N} \sum_{k=0}^{K} \left(\hat{p}_i - p_i\right)^2 + \psi_i^{k} \log\left(\hat{\psi}_i^{k}\right)
\tag{11}
\]

We adopt the negative of the network’s mean rate as an unsupervised loss. The value of the loss function decreases when the data rate of each link, $\gamma_i$, increases.

\[
L_{rate}(\hat{\Psi}, \hat{P}) = - \sum_{i=0}^{N} \sum_{k=0}^{K} \frac{W \hat{\psi}_i^{k}}{N} \log_2\left(1 + \mathrm{SINR}_i^{k}\left(\hat{p}_i^{k}, \hat{\psi}_i^{k}\right)\right)
\tag{12}
\]

In order to ensure that every link retains the necessary minimum capacity, we incorporate a regulation loss into our model, strategically managing the rate of each link. Contrary to the conventional approach that penalizes the model when rates drop below $\gamma_{\min}$ [31], our method imposes a penalty when certain rates are excessively elevated, as expressed in equation (13). This strategy pushes the model to generate rates as near to $\gamma_{\min}$ as possible. Consequently, while rates that are excessively high are brought down, those that are too low are also increased due to the shared radio resources. Mathematically,

\[
L_{reg}(\hat{\Psi}, \hat{P}) = \frac{1}{N} \sum_{i=0}^{N} \max\left(0, \gamma_i - \gamma_{\min}\right)
\tag{13}
\]

where $L_{reg}$ computes the average extent to which each rate, $\gamma_i$, surpasses the minimum, $\gamma_{\min}$, considering only the excesses, due to the max function. This regulation loss aims to steer the model to adhere closely to the minimum rate, preventing it from significantly exceeding it and thus ensuring a consistent rate output across all links. When paired with the maximization of the network mean rates, it lends appreciable stability to our model. We note that
processor Intel(R) Xeon(R) W-1270 CPU, 16.0 GB memory, and 3.40 GHz. The code is implemented using Python 3.9 with the Deep Graph Library (DGL) [57] and PyTorch as a backend.

Regarding network parameters, we configure the number of GNN layers to T = 4, and the embedding dimension is established at D = 10. We design MLP1 and MLP2 with three hidden layers; between these layers, we apply a ReLU activation function and batch normalization. Our CNN block comprises 3 convolution filters, with ReLU activation functions interposed. The parameters for the convolution filters, including strides, padding, and dilation, are all set to 1. The kernel size is defined as (3 × 3), ensuring that the spatial dimensions of the input tensor remain unchanged. The channel input-output pairs are configured as (10, 5, 2, 1). For the training and the deployment process, we opt for a fixed learning rate of 1.0 without any decay over epochs, as Adadelta would adapt it accordingly.

• DNN [31]: The CSI is reshaped and fed into two DNN architectures. The first DNN architecture is responsible for power control, while the second DNN architecture handles channel allocation. However, due to the DNN architecture’s inflexibility towards variations in the number of links, we adjust the number of neurons in the DNN to be suitable for the selected N and retrain it with an adequate dataset.
• REGNN [40]: The problem is addressed using a Resilient GNN policy, trained using an unsupervised primal-dual approach. We average and normalize the CSI across resource blocks to construct a graph topology similar to the one used in the paper.

D. TRAINING CONVERGENCE ANALYSIS
We present in this subsection convergence analyses for the supervised and unsupervised phases in different wireless scenarios.
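The supervised and unsupervised training phases minimize the loss terms defined in the loss function design subsection. The sketch below writes those terms out in plain Python for a single sample, in the spirit of (11) to (13) and under several assumptions: nested lists replace tensors, the per-link rates and per-RB SINR values are passed in precomputed rather than derived from the CSI, the squared power error is accumulated once per link, the cross-entropy term uses the standard negative sign, and a small `eps` guards the logarithm. All function and parameter names are illustrative.

```python
import math


def supervised_loss(p_hat, p, psi_hat, psi, eps=1e-12):
    """Squared power error plus categorical cross-entropy over N links, K RBs."""
    mse = sum((ph - pt) ** 2 for ph, pt in zip(p_hat, p))
    cce = -sum(
        target * math.log(prob + eps)
        for prob_row, target_row in zip(psi_hat, psi)
        for prob, target in zip(prob_row, target_row)
    )
    return mse + cce


def rate_loss(psi_hat, sinr, bandwidth):
    """Negative mean rate: -(W / N) * sum_i sum_k psi_hat * log2(1 + SINR)."""
    n_links = len(psi_hat)
    total = sum(
        alloc * math.log2(1.0 + s)
        for alloc_row, sinr_row in zip(psi_hat, sinr)
        for alloc, s in zip(alloc_row, sinr_row)
    )
    return -bandwidth * total / n_links


def regulation_loss(rates, gamma_min):
    """Average excess of each link rate over gamma_min (only excesses count)."""
    return sum(max(0.0, r - gamma_min) for r in rates) / len(rates)


# Three links with a 300 bps requirement: only the first exceeds it (by 50).
example_reg = regulation_loss([350.0, 300.0, 250.0], gamma_min=300.0)
```

How the three terms are weighted and scheduled across the two phases is not shown here; the sketch only makes the individual terms concrete.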
2) UNSUPERVISED CONVERGENCE
Supervised learning is typically upper-bounded by the performance of the process used to create the training dataset, i.e., PYMOO. Thus, we demonstrate the impact of unsupervised learning to increase the model’s performance by directly maximizing the mean network rate while minimizing the QoS violations. We conduct several tests across different wireless scenarios by adjusting the minimum required rate γmin, the number of links N, the distribution of link locations dmax, and the shadowing deviation σ. For every parameter change, we generate 100 samples and analyze the average network mean rate evolution and QoS violation probability over the unsupervised learning iterations.

Fig. 8 illustrates the impact of varying the number of links on the average sum rate and QoS probability. We set N = {50, 75, 100, 125}, dmax = 50 m, γmin = 300 bps, and σ = 4 dB. The top graph indicates a convergence trend in average sum rates for all the considered N values, with a rise of around 20% after 1000 iterations. Higher N values produce lower rates since the radio resources are finite and invariant. The bottom graph underscores the improvement in QoS adherence over the iterations. For instance, N = 125 decreases its initial violation probability from 17.5% to nearly 1%, while N = 50 reduces it from 5% to almost 0%. This demonstrates that our approach can achieve improved performances while maintaining strict QoS compliance for various numbers of links. Fig. 9, on the other hand, evaluates
FIGURE 11. Model’s average sum rate and QoS probability over
unsupervised learning in different σ values.
channel realizations. Our model, as shown in (c), registers approximately 5% QoS violations, whereas other schemes hover around 50%, except for the PYMOO and REGNN approaches.

Next, we show the average sum rate of the links, the probability of QoS constraint violation, and the level of QoS violation, which is defined as the difference between the QoS requirement and the rate when the QoS constraint is violated, i.e., E[γmin − γ | γmin > γ], as a function of γmin in Figs. 13(a-c), respectively. In Fig. 13(a), the average sum rate is observed to decrease as γmin increases. This decline is attributable to the restrictions on all links’ transmit power and channel usage, which are necessary to minimize interference and, thus, to meet the QoS constraints. Notably, the proposed scheme maintains superior performance even as its average sum rate marginally declines. Moreover, this behavior widens the performance disparity with the PYMOO scheme as γmin increases. In contrast, other benchmark schemes consistently underperform relative to the proposed strategy. Regarding Fig. 13(b), the QoS violation probability for our model remains low. Specifically, it hovers close to 0 when γmin is minimal and reaches approximately 5% at higher γmin values. This performance is comparable with the PYMOO, REGNN, and DGNN approaches. In contrast, the RANDOM, SLSQP, and DNN schemes exhibit substantially elevated violation probabilities, underscoring the efficacy of the proposed scheme. Lastly, Fig. 13(c) highlights the proposed scheme’s performance in terms of QoS violation levels. Although the QoS constraint may occasionally be breached, the deviation of the link’s rate from γmin is still minimal (around 10%). This ensures that even when violations occur, their impact remains largely inconsequential.

F. IMPACT OF THE NUMBER OF LINKS
In Figs. 14(a)-14(c), we illustrate the average sum rate of the network, the probability of QoS violation, and the level of QoS violation against the number of links, N. Fig. 14(a) highlights that as the value of N increases from 50 to 200, there is a marked decrease in the average sum rate across all benchmark schemes. This downward trend can be attributed to the intensified competition for available radio resources, resulting from the addition of links to the network. Other benchmark schemes consistently fall behind our proposed scheme by margins of 15-30%, highlighting the superiority of our proposed model. On the other hand, we observe in Figs. 14(b) and 14(c) a modest increase in both the probability of QoS violation and its level as N grows. Despite this trend, our proposed scheme consistently outperforms other benchmark schemes, registering less than a 10% probability of violation and maintaining the level of violation below 50 bps at higher N values. These metrics confirm the scheme’s ability to maintain QoS requirements, even in denser networks. These tests highlight the scalability of our GNN-based model. It consistently delivers robust performance across expansive networks without the need for retraining. This characteristic emphasizes its capability to generalize to larger graphs, while trained only on smaller graphs. In contrast, the DNN approach, even with retraining at each distinct N value, continually lags behind, revealing its inherently limited scalability.

modeling, has decreased the execution time by a factor of K. As a result, considering a higher K value renders our approach much more efficient than the other schemes.
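The two QoS metrics reported in these figures can be computed directly from the per-link rates. The sketch below is a minimal pure-Python version, assuming the rates of all links and channel realizations are pooled into one list; the function name is illustrative. The violation probability is the fraction of links with γ < γmin, and the violation level is the empirical counterpart of E[γmin − γ | γmin > γ].

```python
def qos_metrics(rates, gamma_min):
    """Return (violation probability, mean violation level) for a list of rates."""
    shortfalls = [gamma_min - r for r in rates if r < gamma_min]
    probability = len(shortfalls) / len(rates)
    # The conditional mean is undefined without violations; report 0.0 then.
    level = sum(shortfalls) / len(shortfalls) if shortfalls else 0.0
    return probability, level


# Four links with a 300 bps requirement: two fall short, by 50 and 20 bps.
probability, level = qos_metrics([250.0, 300.0, 400.0, 280.0], gamma_min=300.0)
```

Reporting the level alongside the probability separates how often the constraint is missed from how badly it is missed, which is the distinction the discussion of Figs. 13(c) and 14(c) relies on.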
tions are quantified using a specific expression that defines the discrepancy between the estimated and actual multipath fading effects. The relationship between the estimated multipath fading values, $h_{ij}^{k}$, and their true counterparts, $\tilde{h}_{ij}^{k}$, is described by the first-order Gauss-Markov process [60], expressed as follows:

\[
\tilde{h}_{ij}^{k} = \sqrt{1 - \sigma_e^{2}}\, h_{ij}^{k} + \sigma_e n_{ij}^{k}
\tag{14}
\]

Here, $n_{ij}^{k}$ represents the error associated with the estimated channel, $h_{ij}^{k}$, which adheres to a complex Gaussian distribution. The error coefficient σe characterizes the precision of the CSI, where σe ranges from 0 to 1. A smaller value of σe indicates a higher CSI accuracy, approaching perfect accuracy as σe tends to zero. Consequently, we can express the noisy channel with the following equation:

\[
\tilde{g}_{ij}^{k} = \beta_{ij}^{k} \alpha_{ij}^{k} \left|\tilde{h}_{ij}^{k}\right|^{2}
\tag{15}
\]

To test the robustness, we first generate a clean CSI. We then apply Equation (15) to introduce varying degrees of noise to the clean CSI by adjusting σe within its defined range. Using the noise-inflicted CSI, we proceed to determine the power and channel allocation solutions for all benchmark schemes and our approach. Upon finalizing these solutions, we calculate the actual link rates, employing the clean CSI. This method allows us to measure the effectiveness of our model in real-world conditions where channels are infrequently perfect.

Fig. 18(a) to Fig. 18(c) demonstrate that the GNN scheme excels in performance against a rising distortion coefficient, σe. In graph (a), all benchmark schemes experience a reduced average sum rate as σe increases due to the distortion, yet the GNN maintains the highest rates, indicating a strong resistance to distortion. Graph (b) reveals that the GNN’s probability of QoS violations stays under 10%, contrasting with the marked vulnerability of other schemes under the same conditions. Graph (c) shows the GNN’s QoS violation level remains below 10 bps, surpassing by far other schemes, which worsen with higher σe. GNN’s consistent robustness, attributed to its permutation invariant features, showcases its superior design in mitigating distortion and preserving service quality.

VII. CONCLUSION
In conclusion, we presented a novel GNN-based framework for jointly solving power control and spectrum allocation in a non-orthogonal wireless environment. Our approach demonstrated superior performance in terms of average sum rate and QoS preservation in different wireless setups compared to other heuristic and learnable approaches and achieved robustness over an imperfect channel. Additionally,

[9] F. Jameel, Z. Hamid, F. Jabeen, S. Zeadally, and M. A. Javed, "A survey of device-to-device communications: Research issues and challenges," IEEE Commun. Surveys Tuts., vol. 20, no. 3, pp. 2133–2168, 3rd Quart., 2018.
[10] S. Gupta, R. Patel, R. Gupta, S. Tanwar, and N. Patel, "A survey on resource allocation schemes in device-to-device communication," in Proc. 12th Int. Conf. Cloud Comput., Data Sci. Eng. (Confluence), Jan. 2022, pp. 140–145.
[11] Z. Lin and Y. Liu, "Joint uplink and downlink transmissions in user-centric OFDMA cloud-RAN," IEEE Trans. Veh. Technol., vol. 68, no. 8, pp. 7776–7788, Aug. 2019.
[12] M. K. Shehata, S. M. Gasser, H. M. El-Badawy, and M. E. Khedr, "Optimized dual uplink and downlink resource allocation for multiple class of service in OFDM network," in Proc. IEEE Int. Symp. Signal Process. Inf. Technol. (ISSPIT), Dec. 2015, pp. 597–601.
[13] D. H. N. Nguyen, L. B. Le, and Z. Han, "Optimal uplink and downlink channel assignment in a full-duplex multiuser system," in Proc. IEEE Int. Conf. Commun. (ICC), May 2016, pp. 1–6.
[14] R. Ruby, S. Zhong, H. Yang, and K. Wu, "Enhanced uplink resource allocation in non-orthogonal multiple access systems," IEEE Trans. Wireless Commun., vol. 17, no. 3, pp. 1432–1444, Mar. 2018.
[15] C. Kai, L. Xu, J. Zhang, and M. Peng, "Joint uplink and downlink resource allocation for D2D communication underlying cellular networks," in Proc. 10th Int. Conf. Wireless Commun. Signal Process. (WCSP), Oct. 2018, pp. 1–6.
[16] Y. Pan, C. Pan, Z. Yang, and M. Chen, "Resource allocation for D2D communications underlaying a NOMA-based cellular network," IEEE
our approach exhibited scalability, stability, and general- Wireless Commun. Lett., vol. 7, no. 1, pp. 130–133, Feb. 2018.
ization, making it suitable for various network structures [17] P. Mach, Z. Becvar, and M. Najla, ‘‘Resource allocation for D2D
communication with multiple D2D pairs reusing multiple channels,’’ IEEE
with different setups, such as D2D networks and Downlink- Wireless Commun. Lett., vol. 8, no. 4, pp. 1008–1011, Aug. 2019.
Uplink cellular scenarios. This study establishes a foundation [18] M. Najla, Z. Becvar, and P. Mach, ‘‘Reuse of multiple channels by multiple
for advanced RRM in future wireless networks. Future D2D pairs in dedicated mode: A game theoretic approach,’’ IEEE Trans.
Wireless Commun., vol. 20, no. 7, pp. 4313–4327, Jul. 2021.
research should delve into GNNs’ capabilities for dynamic
[19] S. Liu, Y. Wu, L. Li, X. Liu, and W. Xu, ‘‘A two-stage energy-
spectrum allocation, interference management, and network efficient approach for joint power control and channel allocation in D2D
optimization in changing conditions. Additionally, in-depth communication,’’ IEEE Access, vol. 7, pp. 16940–16951, 2019.
theoretical analysis is needed to pinpoint the best graph [20] T. O’Shea and J. Hoydis, ‘‘An introduction to deep learning for the physical
layer,’’ IEEE Trans. Cognit. Commun. Netw., vol. 3, no. 4, pp. 563–575,
representations of wireless networks and fine-tune the GNN Dec. 2017.
embedding layer. Lastly, incorporating temporal dynamics [21] Y. Shen, Y. Shi, J. Zhang, and K. B. Letaief, ‘‘LORM: Learning to optimize
into GNN training could further improve RRM outcomes. for resource management in wireless networks with few training samples,’’
IEEE Trans. Wireless Commun., vol. 19, no. 1, pp. 665–679, Jan. 2020.
[22] F. Liang, C. Shen, W. Yu, and F. Wu, ‘‘Towards optimal power control via
REFERENCES ensembling deep neural networks,’’ IEEE Trans. Commun., vol. 68, no. 3,
[1] Y. Zhao, J. Zhao, W. Zhai, S. Sun, D. Niyato, and K. Lam, ‘‘A survey pp. 1760–1776, Mar. 2020.
of 6G wireless communications: Emerging technologies,’’ in Advances [23] H. Sun, X. Chen, Q. Shi, M. Hong, X. Fu, and N. D. Sidiropoulos,
in Information and Communication (Advances in Intelligent Systems and ‘‘Learning to optimize: Training deep neural networks for interfer-
Computing), vol. 1363, K. Arai, Ed. Cham, Switzerland: Springer, 2021. ence management,’’ IEEE Trans. Signal Process., vol. 66, no. 20,
[2] M. Alsabah et al., ‘‘6G wireless communications networks: A comprehen- pp. 5438–5453, Oct. 2018.
sive survey,’’ IEEE Access, vol. 9, pp. 148191–148243, 2021. [24] W. Lee, M. Kim, and D. Cho, ‘‘Deep power control: Transmit power
[3] C. D. Alwis et al., ‘‘Survey on 6G frontiers: Trends, applications, control scheme based on convolutional neural network,’’ IEEE Commun.
requirements, technologies and future research,’’ IEEE Open J. Commun. Lett., vol. 22, no. 6, pp. 1276–1279, Jun. 2018.
Soc., vol. 2, pp. 836–886, 2021. [25] Y. Shen, Y. Shi, J. Zhang, and K. B. Letaief, ‘‘Graph neural networks for
[4] M. Moussaoui, E. Bertin, and N. Crespi, ‘‘5G shortcomings and beyond- scalable radio resource management: Architecture design and theoretical
5G/6G requirements,’’ in Proc. 1st Int. Conf. 6G Netw. (6GNet), Jul. 2022, analysis,’’ IEEE J. Sel. Areas Commun., vol. 39, no. 1, pp. 101–115,
pp. 1–8. Jan. 2021.
[5] T. Akhtar, C. Tselios, and I. Politis, ‘‘Radio resource management: [26] S. He et al., ‘‘An overview on the application of graph neural networks in
Approaches and implementations from 4G to 5G and beyond,’’ Wireless wireless networks,’’ IEEE Open J. Commun. Soc., vol. 2, pp. 2547–2565,
Netw., vol. 27, no. 1, pp. 693–734, Jan. 2021, doi: 10.1007/s11276-020- 2021.
02479-w. [27] L. Ruiz, F. Gama, and A. Ribeiro, ‘‘Graph neural networks: Architectures,
[6] F. Qamar, M. U. A. Siddiqui, M. N. Hindia, R. Hassan, and Q. N. Nguyen, stability, and transferability,’’ Proc. IEEE, vol. 109, no. 5, pp. 660–682,
‘‘Issues, challenges, and research trends in spectrum management: A May 2021.
comprehensive overview and new vision for designing 6G networks,’’ [28] M. Lee, G. Yu, and G. Y. Li, ‘‘Learning to branch: Accelerating resource
Electronics, vol. 9, no. 9, p. 1416, Sep. 2020. allocation in wireless networks,’’ IEEE Trans. Veh. Technol., vol. 69, no. 1,
[7] J. Huang, S. Huang, C.-C. Xing, and Y. Qian, ‘‘Game-theoretic power pp. 958–970, Jan. 2020.
control mechanisms for device-to-device communications underlaying [29] Z. Zhang and M. Tao, ‘‘Learning-based branch-and-bound for non-
cellular system,’’ IEEE Trans. Veh. Technol., vol. 67, no. 6, pp. 4890–4900, convex complex modulus constrained problems with applications in
Jun. 2018. wireless communications,’’ IEEE Trans. Wireless Commun., vol. 21, no. 6,
[8] A. Ramezani-Kebrya, M. Dong, B. Liang, G. Boudreau, and pp. 3752–3763, Jun. 2022.
S. H. Seyedmehdi, ‘‘Joint power optimization for device-to-device [30] T. Lin and Y. Zhu, ‘‘Beamforming design for large-scale antenna arrays
communication in cellular networks with interference control,’’ IEEE using deep learning,’’ IEEE Wireless Commun. Lett., vol. 9, no. 1,
Trans. Wireless Commun., vol. 16, no. 8, pp. 5131–5146, Aug. 2017. pp. 103–107, Jan. 2020.