
This paper has been accepted by IEEE Communications Surveys and Tutorials.

Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities
Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili Jin,
Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Fellow, IEEE,
Charlie Zhang, Fellow, IEEE, Xianbin Wang, Fellow, IEEE, Jiangchuan Liu, Fellow, IEEE.
arXiv:2405.10825v2 [eess.SY] 16 Sep 2024

Abstract—Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks based on human instructions, paving the way to artificial general intelligence (AGI)-enabled 6G. Given the great potential of LLM technologies, this work aims to provide a comprehensive overview of LLM-enabled telecom networks. In particular, we first present LLM fundamentals, including model architecture, pre-training, fine-tuning, inference and utilization, model evaluation, and telecom deployment. Then, we introduce LLM-enabled key techniques and telecom applications in terms of generation, classification, optimization, and prediction problems. Specifically, the LLM-enabled generation applications include telecom domain knowledge, code, and network configuration generation. After that, the LLM-based classification applications involve network security, text, image, and traffic classification problems. Moreover, multiple LLM-enabled optimization techniques are introduced, such as automated reward function design for reinforcement learning and verbal reinforcement learning. Furthermore, for LLM-aided prediction problems, we discuss time-series prediction models and multi-modality prediction problems for telecom. Finally, we highlight the challenges and identify the future directions of LLM-enabled telecom networks.

Index Terms—Large language model, telecommunications, generation, classification, prediction, optimization.

I. INTRODUCTION

While 5G networks have entered the commercial deployment stage, the academic community has started the exploration of envisioned 6G networks. In particular, 6G networks are expected to achieve terabits per second (Tbps) level data rates, 10^7/km^2 connection densities, and lower than 0.1 ms latency [1]. To achieve these goals, the International Telecommunication Union (ITU) has defined six key use cases for envisioned 6G networks [2]. Specifically, three cases are extensions of IMT-2020 (5G), namely immersive communication, hyper-reliable and low-latency communication, and massive communication, and the other three novel use cases are ubiquitous connectivity, integrated sensing and communication, and AI and communication. These novel techniques have shown satisfactory performance towards 6G requirements, but the complexity of network management has also significantly increased. From 3G and 4G LTE to 5G and envisioned 6G networks, telecommunication (telecom) networks have become a complicated large-scale system, including core networks, transport networks, network edge, and radio access networks [3]. Moreover, 6G ubiquitous connectivity aims to address presently uncovered areas, e.g., rural and sparsely populated areas, by integrating other access systems such as satellite communications. In addition, 6G integrated sensing and communication is designed to improve applications requiring sensing capabilities, e.g., assisted navigation, activity detection, and environmental monitoring. Despite the potential benefits, such a highly integrated network architecture and set of functions may place a huge burden on 6G network management, including network configuration and troubleshooting, product design and coding, standard specification development, performance optimization and prediction, etc.

To handle such complexity, machine learning (ML) has become one of the most promising solutions, and there have been a large number of studies on artificial intelligence (AI)/ML-enabled wireless networks, e.g., reinforcement learning-based network management [4], deep neural network-enabled channel state information (CSI) prediction [5], and federated learning for distributed model training in wireless environments [6]. For example, convex optimization has been applied to optimize network performance, but it requires problem-specific transformation for convexity. By contrast, reinforcement learning transforms the problem into a unified Markov decision process (MDP), and then interacts with the environment to explore optimal policies. Compared with conventional optimization algorithms [7], reinforcement learning overcomes the complexity of dedicated problem reformulation, and can better

Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Can Chen, Yili Jin, Haolun Wu, Dun Yuan, Li Jiang, and Xue Liu are with the School of Computer Science, McGill University, Montreal, QC H3A 0E9, Canada (emails: {hao.zhou4, chengming.hu, ye.yuan3, yufei.cui, can.chen, yili.jin, haolun.wu, dun.yuan, li.jiang3}@mail.mcgill.ca, [email protected]);
Di Wu is with the School of Electrical and Computer Engineering, McGill University, Montreal, QC H3A 0E9, Canada (email: [email protected]);
Charlie Zhang is with Samsung Research America, Plano, TX 75023, USA (email: [email protected]);
Xianbin Wang is with the Department of Electrical and Computer Engineering, Western University, London, ON N6A 3K7, Canada (e-mail: [email protected]);
Jiangchuan Liu is with the School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada (e-mail: [email protected]).
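The reinforcement-learning workflow described in this Introduction (cast the problem as a Markov decision process, then interact with the environment to explore optimal policies) can be made concrete with a minimal tabular Q-learning sketch. The two-state environment below is a toy stand-in invented for illustration, not a telecom problem from this survey:

```python
import random

# Toy MDP with two states and two actions. In state 0, action 1 moves
# the agent to state 1; in state 1, action 1 yields reward +1 and
# returns to state 0. Action 0 always stays in the current state.
def step(state, action):
    if state == 0:
        return (1, 0.0) if action == 1 else (0, 0.0)
    return (0, 1.0) if action == 1 else (1, 0.0)

def q_learning(steps=5000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]  # Q[state][action]
    state = 0
    for _ in range(steps):
        # Epsilon-greedy: mostly exploit, occasionally explore.
        if rng.random() < eps:
            action = rng.randrange(2)
        else:
            action = 0 if q[state][0] >= q[state][1] else 1
        next_state, reward = step(state, action)
        # Temporal-difference update toward the Bellman target.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        state = next_state
    return q

q = q_learning()
print(q)  # action 1 should score higher than action 0 in both states
```

No problem-specific convex reformulation is needed: the same update rule applies to any environment exposing a `step`-style interface, which is the flexibility the survey attributes to reinforcement learning.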

Fig. 1. Organization and key topics covered in this work.

handle environmental uncertainties, e.g., the growing diversity of user preferences, and more distributed and heterogeneous resources in future telecom networks. These studies have demonstrated the importance of incorporating ML to improve the efficiency, reliability, and quality of telecom services.

Recently, large language model (LLM) techniques have attracted considerable interest from both academia and industry. Unlike previous ML algorithms, these large-scale models with a huge number of parameters have shown versatile comprehension and reasoning capabilities in various fields such as health care [8], law [9], finance [10], education, and so on [11]. For instance, Wu et al. introduced the BloombergGPT model, which is trained on a wide range of financial data with 50 billion parameters, and Med-PaLM 2, developed by Google, achieves an 86.5% correct rate on a medical question answering dataset [8]. LLM technologies have many promising fea-

tures such as in-context learning (ICL), step-by-step learning, and instruction following [12]. Existing studies have shown that LLMs can answer telecom-domain questions, generate troubleshooting reports, develop project code, and configure networks, which will significantly lower the difficulty of 6G ubiquitous connectivity management. Meanwhile, for 6G integrated sensing and communication, LLMs can understand and process multi-modal data, e.g., text, satellite or street camera images, 3D LiDAR maps, and videos. This provides a promising approach to simulate and understand the 3D wireless signal transmission environment.

Despite the great potential, LLM's real-world application is still at a very early stage, especially for domain-specific scenarios. For instance, telecom is a broad field that includes various knowledge domains, e.g., signal transmissions, protocols, network architectures, devices, and different standards. An LLM is expected to properly understand and generate content that aligns with real-world details and the specific requirements of telecom applications [13]. However, such specific telecom-related requirements are rare in the existing knowledge base of general-domain LLMs. Therefore, applying a general-domain LLM directly to telecom tasks may lead to poor performance. Meanwhile, fine-tuning LLMs on telecom datasets may improve LLM performance on domain-specific tasks, but telecom-specific dataset collection and filtering still require careful design and evaluation. In addition, many telecom tasks require multi-step planning and thinking, e.g., a simple coding task can include multiple steps, indicating dedicated prompting and analyses from the telecom perspective [14].

Given the above opportunities and challenges, this work presents a comprehensive survey of LLM-enabled telecom networks. Different from existing studies that focus on one specific aspect such as edge intelligence [15], [16] or grounding and alignment [17], this work provides a comprehensive survey on fundamentals, key techniques, and applications of LLM-enabled telecom. To be specific, this work focuses on generative models that were originally developed for language tasks, i.e., language models, and it also involves more diverse techniques and broad application scenarios such as optimization and prediction problems. In this survey, the term "foundation models" refers to models specifically developed from scratch for applications that are beyond pure language-related tasks, such as the prediction foundation models in Section VII-B, while "LLM-enabled" or "LLM-aided" approaches denote methods that repurpose existing pre-trained language models for telecom tasks. Moreover, when referring to LLMs, we mean that the inputs to the model are purely text and the model generates purely text as outputs, even if the model can accept inputs in other modalities, such as GPT-4V and GPT-4o. When discussing multi-modal inputs, we explicitly describe them as multi-modal large language models or multi-modal LLMs.

Although LLM development was originally motivated by natural language tasks, it is worth noting that there have been diverse state-of-the-art explorations that are beyond conventional language processing tasks, e.g., coding and debugging [18], recommendation [19], LLM-enabled agents [20], instruction-based optimization [21], network time-series prediction and decision making [22], etc. These LLM-inspired techniques have become crucial pillars of LLM studies, and exploring them is crucial to take full advantage of LLM capabilities. Fig. 1 presents the organization of this work, in which the left side indicates the telecom scenarios and demand, and the right side shows the LLM-enabled techniques. To better present the detailed application scenarios, the bottom of Fig. 1 shows telecom environments that include radio access networks, network edge, central cloud, and other network elements such as regular users, malicious users, mmWave beams, environment image sensing, RISs, backhaul traffic, etc. Meanwhile, we categorize key telecom applications into generation, classification, optimization, and prediction problems to better distinguish different scenarios and customized designs.1 In particular, we focus on the following topics:

1) LLM fundamentals: Understanding LLM fundamentals is the prerequisite for developing advanced applications in telecom networks [11]. Compared with existing studies [15]–[17], this work presents a more comprehensive overview of the model architecture, pre-training, fine-tuning, inference and utilization, and evaluation. Additionally, it presents different approaches to deploy LLMs in telecom networks, such as central cloud, network edge, and mobile LLM [16]. It further analyzes LLM fundamentals from the telecom application perspective, e.g., training or fine-tuning telecom-specific LLMs, and the importance of prompting and multi-step planning techniques for telecom tasks.

2) LLM for generation problems in telecom: Generating desired content is the most common usage of LLMs, and here we investigate the applications to specific telecom scenarios. In particular, this involves answering telecom-domain questions, generating troubleshooting reports, project coding, and network configuration. It shows that LLM generation capabilities are particularly useful in text- and language-related telecom tasks to save human effort, e.g., automated code refactoring and design [14], recommending troubleshooting solutions [23], and generating network configurations [24].

3) LLM-based classification for telecom: Classification is a common task in the telecom field, and we present LLM-enabled network attack classification and detection, as well as telecom text, image, and traffic classification problems. For instance, there have been many studies on vision-aided blockage prediction and beamforming for 6G networks [25], and some LLMs can provide zero-shot image classification capabilities, overcoming the training difficulties of conventional algorithms in complicated signal transmission environments [26].

4) LLM-enabled optimization techniques: Optimization techniques are of great importance to telecom networks, e.g., resource allocation and load balancing [7], and LLMs offer new opportunities. In particular, we introduce LLM-aided automated reward function design for reinforcement learning,

1 Note that although the classification, optimization, and prediction capabilities are all based on the LLM's inference and generation capabilities, this organization can significantly reduce the reader's difficulty in understanding the LLM's potential for telecom applications.
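As a concrete illustration of the in-context learning (ICL) and instruction-following behaviour mentioned above, the snippet below assembles a few-shot prompt for telecom question answering. The helper name and the demonstration pairs are invented for illustration; a real system would draw demonstrations from a telecom corpus:

```python
def build_icl_prompt(instruction, demos, question):
    """Assemble a few-shot in-context-learning prompt: a task
    instruction, demonstration Q/A pairs, and the new question."""
    parts = [instruction.strip(), ""]
    for demo_q, demo_a in demos:
        parts += [f"Q: {demo_q}", f"A: {demo_a}", ""]
    parts += [f"Q: {question}", "A:"]  # the model completes after "A:"
    return "\n".join(parts)

# Hypothetical telecom-domain demonstrations (illustrative only).
demos = [
    ("What does CSI stand for?", "Channel state information."),
    ("What does RIS stand for?", "Reconfigurable intelligent surface."),
]
prompt = build_icl_prompt(
    "Answer the telecom question concisely.",
    demos,
    "What does MDP stand for?",
)
print(prompt)
```

The demonstrations condition the model on the expected format and domain without any parameter update, which is what distinguishes ICL from fine-tuning.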

verbal reinforcement learning, LLM-enabled black-box optimizers, end-to-end convex optimization, and LLM-aided heuristic algorithm design. For example, reinforcement learning has been widely used for network optimization, but the reward functions are usually manually designed with a trial-and-error approach [27]. LLMs can provide automated reward function designs, and such an improvement can significantly promote reinforcement learning applications in the telecom field.

5) LLM-aided prediction in telecom: Prediction techniques are crucial for telecom networks, such as CSI prediction [5], prediction-based beamforming [25], and traffic load prediction [36]. Existing studies have started exploring one-model-for-all time-series models. After pre-training on a large corpus of diverse time-series data, such a model can learn the hidden temporal patterns, and then generalize well across different prediction tasks without extra training. We will first introduce how to pre-train foundation models, and then present frozen pre-trained and fine-tune-based LLMs. In addition, the potential of multi-modal LLMs is discussed for telecom prediction tasks.

6) Challenges and future directions: Finally, we identify the challenges and future directions of LLM-empowered telecom. The challenges focus on telecom-domain LLM training, practical LLM deployment, and prompt engineering for telecom applications. The future directions include LLM-enabled planning, model compression and fast inference, overcoming hallucination problems, retrieval-augmented LLMs, and economic and affordable LLMs.

In summary, the main contribution of this work is that we provide a comprehensive survey of the principles, key techniques, and applications for LLM-enabled telecom networks, ranging from LLM fundamentals to novel LLM-inspired generation, classification, optimization, and prediction techniques along with telecom applications. This work covers nearly 20 telecom application scenarios and LLM-inspired novel techniques, aiming to be a roadmap for researchers to use LLMs to solve various telecom tasks. The rest of this paper is organized as shown in Fig. 1. Section II discusses related surveys, and Section III presents LLM fundamentals. Sections IV, V, VI, and VII focus on generation, classification, optimization, and prediction problems and telecom applications, respectively. Finally, Section VIII identifies the challenges and future directions, and Section IX concludes this work.

II. RELATED SURVEYS

Table I compares this work with existing studies [15]–[17], [28]–[35], including LLM fundamental techniques such as pre-training and fine-tuning, and other key topics ranging from question answering to multi-modality. Firstly, Table I shows that most existing studies focus on the fundamental techniques of LLMs, e.g., pre-training LLMs for telecom tasks in general [15], [16], [32]–[35]. LLM deployment is discussed in many existing studies, including central cloud [15], [31], network edge [16], and mobile execution [17]. Due to the storage and computational resource constraints at the network edge, Lin et al. also summarized various techniques in [16] that can be used to improve LLM training efficiency at the network edge, such as parameter-efficient fine-tuning, split edge learning, and quantized training.

Meanwhile, researchers have investigated various network application scenarios for LLMs and generative AI (GAI), such as integrated satellite-aerial-terrestrial networks [32], secure physical layer communication [37], semantic communication [38], and vehicular networks [39]. For instance, Javaid et al. studied the application of LLMs to integrated satellite-aerial-terrestrial networks, including resource allocation, traffic routing, network optimization, etc. [32]. Huang et al. present a general overview of LLMs for networking, involving network design, configuration, and security [35]. Du et al. present a novel concept named "AI-generated everything", discussing the interactions between GAI and different network layers [40]. In addition, sensing has become an important part of future 6G networks, and multi-modal LLMs are discussed in several existing studies, e.g., integrated sensing and communication with LLMs [17], [29], multi-modal input to LLMs for intelligent sensing and communication [30], and multi-modal sensing [31]. These studies are very valuable explorations of LLM-enabled telecom networks by focusing on model training and deployment. However, LLM techniques are rapidly progressing, and many LLM-inspired novel techniques and applications have been recently proposed. This work is different from existing studies in the following aspects:

1) In terms of LLM fundamentals, we provide comprehensive overviews and analyses, ranging from model architecture and pre-training to LLM evaluation and deployment. For instance, prompt engineering is of great importance for using LLM technology, but some crucial techniques such as chain-of-thought (CoT) [41] and step-by-step planning are not discussed in many existing studies [15]–[17], [28]–[31]. Understanding these prompt design skills is the prerequisite for advanced telecom applications. By contrast, this work provides detailed analyses of chain-of-thought along with telecom applications, e.g., LLM-aided automated wireless project coding with multi-step prompting and thinking [14]. Meanwhile, we also systematically analyze the features of different LLM deployment strategies in telecom, while existing studies usually involve one single deployment [15]–[17], [31].

2) In terms of LLM-inspired techniques, this work presents the most state-of-the-art novel algorithms and designs. For instance, reinforcement learning has been widely applied to telecom optimization problems, but the reward function design requires considerable human effort [27]. Existing studies have shown that LLMs can be used for automated reward function design, achieving performance comparable to human manual designs [42]–[44]. Such a technique may bring revolutionary changes to reinforcement learning techniques, which have great potential for telecom applications. In addition, time-series LLMs are also a promising technique for telecom, enabling one-model-for-all prediction [45]. However, these novel techniques are not mentioned in most existing studies.

3) In terms of telecom applications, we systematically summarize various LLM application scenarios, including question

TABLE I
COMPARISON OF THIS WORK WITH EXISTING SURVEYS

Topics (left to right): LLM fundamental techniques (Architecture, Pre-training, Fine-tuning, Inference, Evaluation, Deployment); Generation applications (Question answering, Troubleshooting, Coding, Network config.); Classification applications (Network attacks, Text, Image, Network traffic); Optimization techniques (LLM-aided RL, Black-box, Convex, Heuristic); Prediction techniques (Time series LLM, Multi-modality).

[15] ✓ ✓
[16] ✓ ✓
[17] ✓ ✓
[28] ✓ ✓ ✓ ✓ ✓ ✓
[29] ✓ ✓ ✓
[30] ✓ ✓
[31] ✓ ✓ ✓
[32] ✓ ✓ ✓ ✓ ✓
[33] ✓ ✓ ✓ ✓
[34] ✓ ✓ ✓ ✓ ✓
[35] ✓ ✓ ✓ ✓
Our work ✓ (all topics)

1 Multi-modality is discussed in several existing studies but not from the prediction perspective. Table I divides key topics from LLM fundamentals to optimization and prediction to better align with the organization of our work.
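The LLM-aided reward function design highlighted in point 2 above is essentially a propose-evaluate-select loop. The sketch below illustrates only that loop structure: `propose_reward_code` is a stand-in for an LLM call (it returns hand-written candidates), and `evaluate_policy` stubs out "train an agent with this reward, then measure an external task metric"; none of this code is taken from the cited works [42]–[44]:

```python
# Skeleton of automated reward design: an "LLM" proposes candidate
# reward functions as code, each candidate is scored by an external
# task metric, and the best-scoring candidate is kept.

def propose_reward_code(round_idx):
    # Placeholder for an LLM call that writes a reward function.
    candidates = [
        "def reward(throughput, latency): return throughput",
        "def reward(throughput, latency): return throughput - 2.0 * latency",
    ]
    return candidates[round_idx]

def evaluate_policy(reward_fn):
    # Stub for "train an RL agent with this reward, then measure task
    # performance": the agent greedily picks the action its reward
    # prefers, and the task metric independently penalizes latency.
    actions = [
        {"throughput": 10.0, "latency": 8.0},
        {"throughput": 8.0, "latency": 1.0},
    ]
    best = max(actions, key=lambda a: reward_fn(a["throughput"], a["latency"]))
    return best["throughput"] - best["latency"]  # external task metric

best_score, best_code = float("-inf"), None
for i in range(2):
    code = propose_reward_code(i)
    namespace = {}
    exec(code, namespace)  # turn the proposed source code into a callable
    score = evaluate_policy(namespace["reward"])
    if score > best_score:
        best_score, best_code = score, code

print(best_code)  # the latency-aware candidate wins on the task metric
```

In the surveyed setting, the evaluation step would run full RL training in a network simulator, and the scores would be fed back to the LLM to guide the next proposal round.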

answering, network troubleshooting, coding, network configuration, network attack classification and security, text and image classification, etc. Compared with existing studies, we present more comprehensive overviews and analyses of using LLM techniques to solve various problems in the telecom domain. For each application, this work provides technical details such as framework, pre-training steps, and prompt designs, which are more informative than existing studies that focus on general system-level designs.

Moreover, Table II summarizes various general and domain-specific LLMs, demonstrating that LLMs have received considerable attention across many fields. Researchers have trained various domain-specific LLMs for their application scenarios, including healthcare [46], finance [10], time series [47], autonomous driving [48], and recommendation systems [49]. For instance, DriveGPT4 is designed to provide interpretable end-to-end autonomous driving [48]. LLMs have also been used for the automated design of reward functions in robot control [44], achieving better performance than human manual designs. Thus, given the rapid progress and great potential of LLMs, a comprehensive survey is expected to summarize the latest and potential applications of LLMs in the telecom field. To this end, this work answers the following question: What are the most state-of-the-art techniques inspired by LLMs, and how can these techniques be used to solve telecom domain problems? The answer to this question is crucial for building intelligent next-generation telecom networks.

TABLE II
SUMMARY OF EXISTING GENERAL AND DOMAIN-SPECIFIC LLMS

Domain | Model | Size | Pre-train | Latest Update
General | GPT-4-Turbo | - | - | Mar 2024
General | Claude-3 Opus | - | - | Mar 2024
General | Gemini-1 Ultra | - | - | Dec 2023
General | Mistral-Large | - | - | Feb 2024
General | Llama-2 | 70B | 2T tokens | Jul 2023
General | Qwen-1.5 | 72B | - | Feb 2024
General | DeepSeek | 67B | 2T tokens | Jan 2024
General | Baichuan-2 Turbo | 13B | - | Sep 2023
Healthcare | MedGPT | - | - | Jul 2021
Healthcare | ChatDoctor | 7B | 100K | Jun 2023
Healthcare | Med-PaLM | 540B | 760B | Dec 2022
Finance | finBERT | 110M | 1.8M | Aug 2019
Finance | FinMA | 30B | 1T tokens | Jun 2023
Finance | BloombergGPT | 50B | 569-770B tokens | Dec 2023
Time Series | TabLLM | 3B | 50,000 rows | Mar 2023
Time Series | LLMTime | 70B | - | Oct 2023
Time Series | TIME-LLM | 7B | - | Jan 2024
Autonomous driving | Driving with LLMs | 7B | 110k | Oct 2023
Autonomous driving | Dilu | - | - | Feb 2024
Autonomous driving | DriveGPT4 | 70B | 112k | Mar 2024
Law | LexiLaw | 6B | - | May 2023
Law | JurisLMs | 13B | - | Jul 2023
Law | ChatLaw | 13B | 980k | Jul 2023
Recommendation | M6-Rec | 300M | 1G | May 2022
Recommendation | TallRec | 7B | 100 samples | Oct 2023
Recommendation | AgentCF | 175B | 20k samples | Oct 2023

III. LLM FUNDAMENTALS

This section will introduce LLM fundamentals, and the overall organization is shown in Fig. 2. It presents a thorough overview of LLM fundamentals, including the model architecture, pre-training, fine-tuning, inference and utilization, and model evaluation. We further discuss how LLMs can be deployed in telecom networks, such as in the central cloud, at the network edge, and on mobile devices. Finally, we analyze LLM fundamentals from the telecom application perspective, e.g., training or fine-tuning LLMs for the telecom domain.

A. Model Architecture

The fundamental component of contemporary LLMs is the transformer scheme [50], which leverages an attention mechanism to capture global dependencies between inputs and outputs. Transformers process raw inputs by tokenizing them and applying embeddings and positional encodings. The vanilla transformer architecture comprises two main components: the encoder and the decoder. The encoder's role is to

extract features and understand the relationships among all input tokens. It employs self-attention, also referred to as bidirectional attention, allowing each token to attend to every other token in both directions. Conversely, the decoder is responsible for producing the output sequence while taking into account the input sequence and previously generated tokens. It initially applies a masked attention mechanism, known as causal attention, ensuring that the current token only attends to previously generated tokens. Additionally, the decoder employs cross-attention, where the query comes from the decoder, and the key and value come from the encoder, enabling the decoder to integrate information from both the input sequence and the already generated tokens. Beyond the basic version of the attention mechanism, various variants have been developed to capture the different relationships among tokens, such as multi-head attention [50], multi-query attention [51], and grouped-query attention [52]. Current architectures can be classified into three distinct categories: encoder-only architecture, encoder-decoder architecture, and decoder-only architecture.

Fig. 2. Organization and key topics of Section III.

1) Encoder-only architecture: Models with an encoder-only structure solely comprise an encoder. These models are tailored for language understanding tasks, where they extract language features for downstream applications such as classification. A prominent example is bidirectional encoder representations from transformers (BERT) [53]. BERT is pre-trained with two main objectives: the masked language model objective, which aims to reconstruct randomly masked tokens, and the next sentence prediction objective, designed to ascertain if one sentence logically follows another. There have been many variants of this model, such as RoBERTa [54], which enhances the performance on downstream tasks2, and ALBERT [55], which introduces two parameter-reduction techniques to accelerate BERT's training process.

2) Encoder-decoder architecture: The foundational transformer block employs an encoder-decoder architecture, wherein the encoder relays keys and values generated by its self-attention module to the decoder for cross-attention processing. For example, the study in [56] introduces the text-to-text transfer transformer, a unified framework that reformulates all text-based language tasks into a text-to-text format, thereby facilitating the exploration of transfer learning within natural language processing (NLP). BART is another well-known model with the standard transformer architecture [57], which employs a denoising autoencoder approach for pre-training sequence-to-sequence models. It introduces arbitrary noise into text and is trained to reconstruct the original content, effectively combining elements of BERT's bidirectional encoding and GPT's causal decoding methodologies.

3) Decoder-only architecture: Decoder-only architectures specialize in unidirectional attention, allowing each output token to attend only to its past tokens and itself. Both prefix and output tokens undergo identical processing within the decoder. Decoders are further distinguished based on their attention mechanisms into causal and non-causal decoders. In causal decoders, every token is restricted to attending to its past tokens and itself; in non-causal decoders, prefix tokens can attend to all tokens within the prefix. Causal decoders are predominantly adopted in popular LLMs, such as the GPT series [58], PaLM [59], and LLaMA [60]. Non-causal decoders [61] resemble encoder-decoder frameworks in their ability to bidirectionally process the prefix sequence and autoregressively generate output tokens sequentially.

Note that an LLM is a complicated system, and there are multiple approaches to apply LLMs to the telecom field, ranging from pre-training and fine-tuning to prompting: for instance, pre-training an LLM from scratch using telecom-domain datasets, fine-tuning a general-domain LLM for specific telecom tasks, or using general-domain LLMs by prompting directly. The following will introduce the key procedures and features of each approach, in which Sections III-B, III-C, and III-D introduce the procedures of pre-training, fine-tuning, and prompting, respectively.

B. LLM Pre-training

The aim of pre-training language models is to predict the next word within a sentence. After being trained on extensive datasets, LLMs exhibit emergent capabilities in comprehension and reasoning. This subsection will introduce dataset collection, preprocessing, and model training techniques.

1) Dataset collection and preprocessing: Datasets for training language models fall into two primary categories: general and specialized. General datasets comprise a diverse range of sources, such as web pages, literature, and conversational corpora. For instance, web pages like Wikipedia [62] can contribute to a language model's broad linguistic

2 Here downstream tasks refer to a series of target tasks that can be solved by the pre-trained model, e.g., text classification, natural language inference.
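The bidirectional (encoder), causal-decoder, and non-causal (prefix) attention patterns contrasted above can be pinned down by the masks they induce. A minimal sketch (the function names are illustrative; mask[i][j] == 1 means token i may attend to token j):

```python
def bidirectional_mask(n):
    # Encoder self-attention: every token attends to every token.
    return [[1] * n for _ in range(n)]

def causal_mask(n):
    # Causal decoder: token i attends only to tokens 0..i (its past
    # tokens and itself).
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def prefix_mask(n, prefix_len):
    # Non-causal (prefix) decoder: tokens inside the prefix attend
    # bidirectionally; generated tokens attend causally.
    mask = causal_mask(n)
    for i in range(prefix_len):
        for j in range(prefix_len):
            mask[i][j] = 1
    return mask

print(causal_mask(3))    # [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
print(prefix_mask(4, 2))  # first two rows see the whole prefix
```

In an actual transformer implementation, the zero entries are typically realized as large negative biases added to the attention logits before the softmax, so masked positions receive (near-)zero attention weight.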

understanding. Meanwhile, literary works also serve as a rich reservoir of formal and lengthy texts [63]. These materials are crucial for teaching LLMs complex linguistic constructs, facilitating the modelling of long-range dependencies. Specialized data involves scientific texts and programming-related data. For example, scientific literature comprises a wealth of formal writing imbued with domain-specific knowledge, encompassing academic papers and textbooks. On the other hand, programming data drawn from online question-answering platforms like Stack Exchange [64], along with public software repositories such as GitHub, provides raw material rich with code snippets, comments, and documentation. Incorporating these specialized texts into the training of LLMs can significantly improve LLM performance in reasoning and domain-specific knowledge applications. However, before pre-training, it is critical to preprocess the collected datasets, which often contain noisy, redundant, irrelevant, and potentially harmful data. The preprocessing procedure may include quality filtering, de-duplication [59], privacy redaction [65], and tokenization [66].

2) Model training: In the model training process, two pivotal hyperparameters are the batch size and the learning rate. For the pre-training of LLMs, a substantial batch size is required, and recent studies suggest incrementally enlarging the batch size to bolster the stability of the training process [59]. In terms of learning rate adjustments, a widely used strategy is to start with a warm-up phase and then proceed with a cosine decay pattern. This approach helps in achieving a more controllable learning rate schedule. To enhance training scalability, several key techniques have been proposed. For instance, 3D parallelism encompasses data parallelism, pipeline parallelism, and tensor parallelism. Data parallelism involves the replication of the model's parameters and optimizer states across multiple GPUs [67], allocating to each GPU a subset of data to process and subsequently aggregating the computed gradients. Pipeline parallelism, as detailed in [68], assigns distinct layers of an LLM to various GPUs, allowing the accommodation of larger models within the confines of GPU memory. Tensor parallelism operates on a similar premise by decomposing the tensors [69], especially the parameter matrices of the LLM, facilitating the distribution and computation across multiple GPUs. Meanwhile, ZeRO is also a useful technique [70], which conserves memory by retaining only a portion of the model's data on each GPU. The remainder of the data is accessible across the GPU network as needed, effectively addressing memory redundancy concerns.

C. LLM Fine-tuning

Fine-tuning refers to the process of updating the parameters of pre-trained LLMs to adapt to domain-specific tasks. Although pre-trained LLMs already have vast language knowledge, they lack specialization in specific areas. Fine-tuning overcomes this limitation by allowing the model to learn from domain-specific datasets, making the LLM more effective on specific applications. This subsection will introduce two fine-tuning strategies: instruction and alignment tuning.

1) Instruction tuning: Instruction tuning is a method for fine-tuning pre-trained LLMs using a collection of natural language-formatted instances. This technique aligns closely with supervised fine-tuning and multi-task prompted training, enhancing the LLM's ability to generalize to unseen tasks, even in multilingual contexts [71]. The process involves collecting or constructing instruction-formatted instances and employing these to fine-tune LLMs in a supervised manner, typically using a sequence-to-sequence loss for training. Models like InstructGPT and GPT-4 have demonstrated the effectiveness of instruction tuning in meeting real user needs and improving task generalization [72], [73]. Instruction-formatted instances usually consist of a task description, an optional input, a corresponding output, and possibly a few examples as demonstrations. These instances can originate from various sources, such as traditional NLP task datasets, daily chat data, and synthetic data. Existing research has reformatted traditional NLP datasets with natural language task descriptions to aid LLMs in understanding tasks, proving particularly effective in enhancing task generalization capabilities [71]. The design and quality of instruction instances will significantly impact the model's performance. Scaling the instructions, for instance, tends to improve generalization ability up to a certain point, beyond which additional tasks may not yield further gains [74]. Diversity in task descriptions and the number of instances per task are also critical, with a smaller number of high-quality instances often sufficing for significant performance improvements [71].

2) Alignment tuning: Alignment tuning aims to ensure LLMs adhere to human values, preventing outputs that could be harmful, biased, or misleading. This concept emerges from the realization that while LLMs excel in various NLP tasks, they may inadvertently generate content that deviates from ethical norms or human expectations [58]. Collecting human feedback is central to the alignment-tuning process. In particular, it involves curating responses from diverse human labellers to guide the LLM toward generating outputs that align with predefined criteria. Approaches to collecting this feedback include ranking-based methods, where labellers evaluate the quality of model-generated outputs, and question-based methods, where labellers provide insights on specific aspects of the outputs, such as their ethical implications [75]. A prominent technique in alignment tuning is reinforcement learning from human feedback (RLHF), where the model is fine-tuned using reinforcement learning algorithms based on human feedback. This process typically starts with supervised fine-tuning using human-annotated data, followed by training a reward model that reflects human preferences, and finally, fine-tuning the LLM using this reward model. Despite its effectiveness, RLHF can be computationally intensive and complex, necessitating alternative approaches for practical applications [76]. An alternative to RLHF is direct optimization through supervised learning, which bypasses the complexities of reinforcement learning. This method relies on constructing a high-quality alignment dataset and directly fine-tuning LLMs to adhere to alignment criteria. Although less
resource-intensive than RLHF, this approach requires careful dataset construction and may not capture the full range of human values and preferences as effectively as RLHF [77]. Additionally, researchers have introduced the direct preference optimization (DPO) technique [78], which eliminates the need for a reward model and allows the model to align directly with preference data. Some advanced LLMs, like Llama 3, utilize both RLHF with proximal policy optimization (PPO) and DPO. However, both RLHF and DPO depend on high-quality human preference data, which is limited and costly to acquire. To address this challenge, methods such as Constitutional AI [79] and RL from AI feedback (RLAIF) [80] have been developed to generate preference data using LLMs, enabling models to learn from AI feedback and facilitating knowledge transfer between models.

D. LLM Inference and Utilization by Prompting

Prompt engineering is the process in which users design various inputs for AI models to generate desired outputs. Compared with fine-tuning, prompting has no requirement for extra training, producing output instantly based on user inputs. It offers a straightforward approach to using LLMs, and its rapid response and training-free features make it a promising method for telecom applications. This subsection will introduce key techniques in prompt engineering, including ICL, CoT prompting, LLM for complex planning, and self-refinement with iterative feedback. The comparisons among different prompt engineering techniques are shown in Fig. 3.

1) In-context learning (ICL): ICL, first introduced with GPT-3 [58], utilizes formatted natural language prompts and integrates task descriptions and examples to guide LLMs in task execution. This approach allows the LLM to recognize and perform new tasks by leveraging contextual information. The design of demonstrations is critical for ICL, encompassing the selection, format, and order of examples. The format of demonstrations involves converting selected examples into a structured prompt, integrating task-specific information and possibly incorporating reasoning enhancements like CoT [74], [82]. The ordering addresses LLM biases, arranging demonstrations based on similarity to the query or employing information-theoretic methods to optimize information conveyance [83], [84]. ICL's underlying mechanisms include task recognition and task learning, through which LLMs can use pre-trained knowledge and structured prompts to infer and solve new tasks. Task recognition involves LLMs identifying the task type from the provided examples, leveraging pre-existing knowledge from pre-training data [85]. Task learning, on the other hand, refers to LLMs acquiring new task-solving strategies through the given demonstrations, a capability that becomes more pronounced with increasing model size [86]. Recent studies suggest that larger LLMs exhibit an enhanced ability to surpass prior knowledge and learn from the demonstrations provided in ICL settings [87].

2) Chain-of-thought (CoT) prompting: CoT prompting is an advanced strategy to enhance LLM performance on complex reasoning tasks, such as arithmetic, commonsense, and symbolic reasoning, by incorporating intermediate reasoning steps into prompts [82]. Differing from ICL's input-output pairing, CoT prompting enriches prompts with sequences of reasoning steps, guiding LLMs to bridge between questions and answers more effectively. Initially proposed as an ICL extension, CoT augments demonstrations from mere input-output pairs to sequences comprising inputs, intermediate reasoning steps, and outputs [82]. These steps help LLMs navigate complex problem-solving more transparently and logically, though they typically require manual annotation. However, creative phrasings such as "Let's think step by step" can trigger LLMs to generate CoTs autonomously, which significantly simplifies the CoT implementation.

Despite improvements, CoT prompting faces challenges such as incorrect reasoning and instability. The enhancement strategies include better prompt design by utilizing diverse and complex reasoning paths, advanced generation strategies, and verification-based methods. These methods address generation issues by exploring multiple paths or validating reasoning steps, thus improving result accuracy and stability. Furthermore, extending beyond linear reasoning chains, recent studies propose tree- and graph-structured reasoning to accommodate more complex problem-solving processes [88]. In addition, CoT prompting significantly benefits large-scale LLMs (over 10B parameters) and tasks requiring detailed step-by-step solutions. However, it may underperform in simpler tasks or when traditional prompting is already effective [82].

3) Planning for complex task solving: While ICL and CoT prompting provide a straightforward approach for task solving, they often fall short in complex scenarios like mathematical reasoning and multi-hop question answering [89]. To this end, prompt-based planning has emerged, breaking down intricate tasks into smaller, manageable sub-tasks and outlining action sequences for their resolution. The planning framework for LLMs encompasses three main components: the task planner, the plan executor, and the environment. The task planner devises a comprehensive plan to address the target task, which could be represented as a sequence of actions or an executable program [90]. This plan is then carried out by the plan executor, which can range from text-based models to code interpreters, within an environment that provides feedback on the execution results [88], [91]. In plan generation, LLMs can utilize text-based approaches to produce natural language sequences or code-based methods for generating executable programs, enhancing the verifiability and precision of the planned actions [91]. Feedback acquisition follows, where the LLM evaluates the plan's efficacy through internal assessments or external signals, refining the strategy based on outcomes from different environments [88]. In addition, the refinement process is crucial for optimizing the plan based on received feedback, and the corresponding methods include reasoning, backtracking, memorization, etc. [88].

4) Self-refinement with iterative feedback: Considering that LLMs may not generate correct answers initially, self-refine has recently emerged to improve their outputs through iterative feedback and refinement. Among these studies, Madaan
Fig. 3. Comparison of various prompt engineering techniques [11], [81]. Different from task-specific examples in ICL demonstrations, CoT prompting
additionally incorporates intermediate reasoning steps in demonstrations. Prompt-based planning breaks down intricate tasks into manageable sub-tasks and
outlines action sequences for their resolution. Self-refine enhances LLM’s outputs through iterative feedback and refinement.
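The contrast that Fig. 3 draws between ICL and CoT demonstrations can be made concrete with a small prompt-building sketch. The toy task, demonstrations, and wording below are hypothetical and only illustrate how a CoT prompt differs from a plain input-output ICL prompt:

```python
# Minimal sketch of ICL vs. CoT prompt construction for a toy telecom task.
# The task and demonstrations are hypothetical, for illustration only.

def build_icl_prompt(demos, query):
    """ICL: each demonstration is a plain input-output pair."""
    lines = [f"Q: {q}\nA: {a}" for q, a in demos]
    return "\n\n".join(lines + [f"Q: {query}\nA:"])

def build_cot_prompt(demos, query):
    """CoT: demonstrations additionally contain intermediate reasoning steps."""
    lines = [f"Q: {q}\nReasoning: {r}\nA: {a}" for q, r, a in demos]
    return "\n\n".join(lines + [f"Q: {query}\nLet's think step by step."])

icl = build_icl_prompt(
    [("Map 'low latency, high reliability' to a 5G service class.", "URLLC")],
    "Map 'high data rate video streaming' to a 5G service class.")

cot = build_cot_prompt(
    [("Map 'low latency, high reliability' to a 5G service class.",
      "Low latency and high reliability match the URLLC definition.",
      "URLLC")],
    "Map 'high data rate video streaming' to a 5G service class.")
```

The only structural difference is the extra "Reasoning:" field in each demonstration and the trigger phrase at the end, which matches the zero-shot CoT idea discussed above.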

et al. [81] first use an LLM to generate initial outputs, and then employ the same LLM to provide specific feedback on these outputs. Note that this feedback is actionable, containing concrete steps to further improve the initial outputs. With such specific and actionable feedback, the same LLM can iteratively refine its outputs until performance converges. Furthermore, Hu et al. [92] propose a self-refined LLM (named TrafficLLM) specifically designed for communication traffic prediction, which leverages in-context learning to enhance predictions through a three-step process: traffic prediction, feedback generation, and prediction refinement. Following the comprehensive feedback, refinement demonstration prompts enable the same LLM to refine its performance on target tasks.

E. Evaluation Metrics of LLM

Evaluating the performance of LLMs is a multifaceted task and receives increasing attention. This subsection focuses on evaluation metrics that encompass various dimensions, including accuracy, hallucination, efficiency, and human alignment. Each of these aspects plays a crucial role in determining the overall applicability of LLMs in real-world scenarios such as telecom networks. Firstly, accuracy is paramount in evaluating LLM technologies as it directly impacts the model's reliability and trustworthiness. It measures how well an LLM can understand and process natural language queries, generate relevant and correct responses, and perform specific tasks like translation, summarization, or question-answering. Benchmarks and standardized datasets are often used to quantitatively evaluate the model's accuracy.

Secondly, hallucination refers to instances where the LLM generates incorrect or factually inconsistent information, often presenting it with a high degree of confidence. This phenomenon can significantly undermine the credibility of LLM-generated content. Evaluating an LLM's tendency to hallucinate involves analyzing the model's responses for factual accuracy, consistency, and relevance to the input prompt. Recent studies show that traditional automatic metrics for summarization such as ROUGE [93] and BERTScore [94] show suboptimal performance on factual consistency measurement [95]. Thereafter, some novel metrics have been proposed to detect hallucination errors, such as AlignScore in [96].

Then, the efficiency of LLMs indicates the computational resources required for training and inference, as well as the speed at which these models can generate responses. As the size of LLMs expands, this expansion leads to significant issues regarding environmental sustainability and the ease of access to these technologies [97], [98]. Evaluating an LLM's efficiency involves a detailed assessment of its performance relative to the consumed resources. Key metrics for this assessment include the energy usage during operations, the time it takes to process information, and the financial burden associated with acquiring and maintaining the necessary hardware infrastructure. Additionally, it is important to consider the efficiency of data usage during training, as optimizing data can reduce computational requirements [99].

The last metric is human alignment. Manual evaluation of LLM alignment with human values generally offers a more holistic and precise assessment compared to automated evaluation [100]. This is supported by numerous studies, such as [101], [102], which incorporate human alignment evaluation to provide a more in-depth analysis of their methods' performance. Human alignment assesses the degree to which the language model's output aligns with human values, preferences, and expectations. It also considers the ethical implications of the generated content, ensuring that the language model produces text that respects societal norms and user expectations, promoting a positive interaction with human users.

F. LLM Deployment in Telecom Networks

Practical deployment is the prerequisite for advanced applications of LLM technologies in telecom networks. In particular, it indicates how LLMs can be deployed within the current telecom network architecture, e.g., central cloud, network edge, or even user devices. LLMs have great demands for computational and storage resources. For instance, GPT-4 has 1.76 trillion parameters and the model size is 45 GB [73], posing a heavy burden on network storage capacities. Fine-tuning an LLM with 7 billion parameters, such as GPT-4-LLM [109], could take nearly 3 hours on an 8×80GB A100 machine, which is extremely time-consuming [110]. In addition, the inference time of LLMs will also contribute to overall network latency, which is related to hardware support,
TABLE III
SUMMARY OF LLM DEPLOYMENT STRATEGIES

Cloud deployment
Main features & Advantages: Cloud deployment is the most straightforward method for LLM deployment. LLMs are usually computationally demanding, and cloud servers can provide abundant computational and storage resources for model training, fine-tuning, and inference.
Potential issues & Difficulties: Cloud deployment indicates higher end-to-end latency for implementing user requests, since the inquiries have to be first uploaded and then processed and downloaded. It may prevent the application of some latency-critical applications such as robot control, vehicle-to-vehicle communications, unmanned aerial vehicle (UAV) control, etc.

Network edge deployment
Main features & Advantages: Network edge deployment can be an appealing approach to shorten the response time and save backhaul bandwidth to the central cloud. It enables rapid user request processing at edge servers or clouds, achieving shorter end-to-end delay than the central cloud-based approach.
Potential issues & Difficulties: Network edge servers are usually resource-constrained, indicating limited computational and storage resources for LLM fine-tuning and inference. Therefore, some techniques may be exploited, e.g., parameter-efficient fine-tuning [103], split edge learning [104], and quantized training [105]. In addition, model compression is also a promising direction for edge LLM deployment.

On-device deployment
Main features & Advantages: On-device LLM is considered a very promising direction to deploy LLMs directly at user devices. It enables customized LLMs based on specific user requests. Meanwhile, on-device LLMs have the lowest service latency by processing tasks locally. Therefore, it has great potential for implementing real-time tasks.
Potential issues & Difficulties: Despite the great advantages, on-device LLMs are still in the very early stages, and the main challenge is to overcome the very limited computational and storage resources at user devices. Apple has proposed a technique to store LLM parameters on flash memory [106] and achieve a 20 times faster inference speed. Qualcomm also announced a new mobile platform to support popular small-scale LLMs [107]. Therefore, how to utilize limited computation resources to achieve faster inference is the key to on-device LLM deployment.

Cache-based deployment
Main features & Advantages: The cache-based approach is proposed by [16] based on the mobile edge computing architecture. Specifically, the authors propose to store the full-precision parameters in the central cloud, quantized parameters in the edge cloud, and the frozen parameters at the user devices, enabling more flexible model training and migration.
Potential issues & Difficulties: Such a distributed deployment approach is promising to save the model storage and migration cost. However, compared with on-device deployment, the cache-based method also requires complicated coordination strategies for model update and synchronization, e.g., the model update and synchronization frequency and the quantization bit version selection.

Cooperative deployment
Main features & Advantages: Cooperative deployment is proposed in [108], which involves the interactions between local small models and cloud-based large models. In particular, it assumes that the local model can collect and submit sensor data selectively to the large model, and the large model will update the small-scale local models based on its domain-specific knowledge.
Potential issues & Difficulties: The cooperative deployment is a feasible solution to connect small-scale local LLMs to large cloud models. However, the local model updating frequency should be carefully determined to reduce the burden on cloud LLMs. In addition, note that the inference is still executed locally, and therefore the required computational resources are still a challenge. To this end, it may be combined with on-device LLMs to address the resource issues.
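The cache-based row above hinges on keeping quantized parameters at the edge cloud and selecting a quantization bit version. As a rough, generic sketch of what b-bit uniform quantization does to a weight vector (a textbook scheme for illustration, not the specific method of [16]):

```python
# Illustrative b-bit uniform quantization of a weight vector, sketching the
# "quantized parameters at the edge" idea in the cache-based row above.
# Generic textbook scheme for illustration, not the method of [16].

def quantize(weights, bits):
    """Map floats to integer levels in [0, 2^bits - 1], plus (scale, offset)."""
    lo, hi = min(weights), max(weights)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize(codes, scale, lo):
    """Reconstruct approximate float weights from integer codes."""
    return [lo + c * scale for c in codes]

w = [-1.0, -0.25, 0.4, 1.0]
codes, scale, lo = quantize(w, bits=2)   # only 4 levels: coarse approximation
w_hat = dequantize(codes, scale, lo)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
# Reconstruction error is bounded by half a quantization step (scale / 2),
# which shrinks as the bit width grows -- the accuracy/storage trade-off
# behind the "quantization bit version selection" noted in the table.
```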

batch size, parallelism, model pruning, etc. [111]. Therefore, it is of great importance to deploy LLMs appropriately to better serve the telecom network demand. We summarize the existing deployment schemes in Table III, including cloud, network edge, on-device, cache-based, and cooperative deployment. We present the details of each strategy as follows:

1) Cloud deployment: Considering LLM's high demand for computational and storage resources, deploying LLMs in the central cloud is a straightforward solution, which can provide substantial computational resources to support the fine-tuning and inference of LLMs [15]. Shen et al. investigate LLM-enabled autonomous edge AI [15], in which the network edge devices can send the user request and dataset features to LLMs in the cloud, and then the LLM can send back the task planning and AI model configuration to network edge devices through the backhaul. After that, the network edge and user devices can collaborate to make edge inferences. Cloud deployment can easily adapt to the existing telecom network architecture, and only a few pieces of extra hardware are needed since the LLM is deployed in the virtual cloud. However, cloud deployment suffers from long response times and high bandwidth costs, since all data has to be transmitted to the cloud, then processed by the LLM, and finally the LLM's output is downloaded [16]. The long response time may prevent applications on latency-critical tasks, e.g., vehicle-to-vehicle networks and unmanned aerial vehicle control. In addition, the frequent multimodal information exchange, such as images and videos, between end users and the cloud LLM will lead to extra bandwidth costs.

2) Network edge deployment: Here network edge refers to the edge cloud or BSs that are closer to users than the central cloud. Network edge deployment can be an appealing approach to shorten the response time and save bandwidth. However, compared with the central cloud, network edge devices usually have limited computational and storage capacities. To this end, multiple techniques can be exploited. For the storage capacity challenge, parameter sharing and model compression may be applied. In particular, LLMs for different downstream tasks may share the same parameters, which can be exploited to save storage capacity. On the other hand, other technologies may be applied to reduce the computational resource demand in fine-tuning and inference, including parameter-efficient fine-tuning [103], split edge learning [104], and quantized training [105]. With these techniques, deploying LLMs at the network edge becomes a practical strategy.
Fig. 4. Illustration of different LLM deployment strategies.
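A back-of-the-envelope estimate helps explain why the strategies in Fig. 4 differ: the weight memory of an LLM scales with its parameter count and numeric precision. The helper below is an illustrative sketch only; it ignores activations, KV cache, and runtime overhead, so real memory usage is higher:

```python
# Back-of-the-envelope weight-memory estimate: parameters x bits / 8 bytes.
# Illustrative only -- ignores activations, KV cache, and runtime overhead.

def weight_memory_gb(n_params, bits):
    return n_params * bits / 8 / 1e9

# A 70B-parameter model in 16-bit precision needs about 140 GB for weights
# alone, while 2-bit quantization shrinks this to about 17.5 GB, consistent
# with the consumer-GPU figures discussed in the text (at an accuracy cost).
fp16_70b = weight_memory_gb(70e9, 16)   # 140.0 GB
int2_70b = weight_memory_gb(70e9, 2)    # 17.5 GB
```

Such a quick estimate indicates which deployment tier (cloud, edge, or device) a given model and bit width can realistically target.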

3) On-device deployment: There are multiple benefits of deploying LLMs on user-side mobile devices, e.g., fast responses and local customization based on the user's specific requirements. However, such a deployment is also challenging since LLMs are usually storage- and computation-intensive. Xu et al. introduced a split learning approach based on collaborative end-edge-cloud computing, aiming to deploy LLM agents at mobile devices and the network edge [17]. Specifically, the authors assume that LLMs with fewer than 10B parameters such as LLAMA-7B can operate on mobile devices, providing real-time inference services. Meanwhile, LLMs with more than 10B parameters such as GPT-4 are deployed on network edge servers, using global information and historical memory to assist the mobile LLM in processing complex tasks. Such a collaboration enables higher flexibility by exploiting mobile LLMs. However, the study of on-device LLMs is still in a very early stage, and it requires considerable efforts to prove the feasibility of such a design [112]. For instance, Apple has proposed a technique to store LLM parameters on flash memory [106], achieving a 20 times faster inference speed than using a GPU with limited dynamic random-access memory (DRAM) capacity. Similarly, Qualcomm has recently announced the Snapdragon 8s Gen 3 mobile platform, which supports popular small-scale LLMs such as Llama 2 and Gemini Nano [107]. These studies may pave the way to effective inference of on-device LLMs.

4) Cache-based deployment: Lin et al. proposed a cache-based method in [16], which utilizes the mobile edge computing architecture to store, cache, and migrate models in edge networks. Specifically, they propose to store the full-precision parameters in the central cloud, quantized parameters in the edge cloud, and finally the frozen parameters at the user devices. Such a separate model caching enables more flexible model training and migration. For instance, edge clouds or servers can apply low-precision computation by using quantized training, improving the edge training speed with limited computational resources. In addition, storing the frozen parameters on user devices can save the storage capacities of the edge cloud, reducing the latency caused by full model migration. However, the cache-based method may require complicated coordination strategies for model update and synchronization, e.g., the model update and synchronization frequency and the quantization bit version selection.

5) Cooperative deployment: Lin et al. proposed a novel EdgeFM approach in [108]. In particular, the edge devices will collect the sensor data from the environment, and then the local model can evaluate the uncertainty features of the collected data and the real-time network dynamics. After that, the local EdgeFM model will selectively upload the unseen data classes to query large models in the cloud, and the large models can periodically update a customized small-scale model at the network edge. Therefore, when the network environment changes, at the early stage, the local model can frequently query large models in the cloud, and then it can execute customized small models on edge devices at the late stage. Such a cooperative deployment can reduce the system overhead and enable dynamic customization of local small models for edge devices. The experiment in [108] shows 3.2x lower end-to-end latency and 34.3% higher accuracy than the baseline.

Finally, Fig. 4 illustrates different LLM deployment strategies. Note that LLM requirements for storage and computational resources are the main motivations for developing various deployment strategies. For instance, the model size of Llama3-8b is around 5 GB, and therefore it can be implemented at the network edge or even user devices, e.g., the Snapdragon 8s Gen 3 mobile platform recently developed by Qualcomm. Similarly, Gemini Nano is less than 2 GB, and such a small size allows on-device deployment, e.g., Google plans to load Gemini Nano onto its Pixel 8 smartphones. By contrast, large-scale LLMs require much more computational resources. For example, inference with Llama3-70b consumes at least 140 GB of GPU RAM. Using 2-bit quantization, the Llama3-70b can be implemented on a 24 GB consumer GPU, but such a low-precision quantization will significantly degrade the model accuracy. To this end, hybrid deployment
methods such as cache-based and cooperative deployment are proposed. The key objective is to take advantage of large-scale LLMs' high accuracy, while reducing the dependency on computational resources. On the other hand, these approaches may be combined, e.g., deploying small-scale on-device LLMs and then using larger cloud models to update the local models periodically. Given these deployment methods, many critical problems can then be investigated, e.g., service delay evaluation and task offloading, which still require more research efforts. For example, Chen et al. proposed a NETGPT scheme in [113], involving offload architecture, splitting architecture, and synergy architecture for cloud-edge collaboration.

G. Analyses of LLM Fundamentals in the Telecom Domain

Previous Sections III-A to III-F have covered the key techniques of LLM fundamentals, ranging from model architecture and pre-training to evaluation and deployment in telecom networks. This subsection will analyze how these fundamental techniques can be applied to the telecom domain.

For telecom applications, pre-training an LLM from scratch can be time-consuming. It first requires extensive dataset collection, and the dataset preprocessing has to consider the format of complicated telecom equations and theories. Meanwhile, it also requires considerable computational resources to pre-train LLMs, leading to heavy burdens for telecom networks. By contrast, a more efficient approach is to fine-tune a general-domain LLM for specific telecom-domain tasks. Applying LLM technologies to the telecom domain requires an in-depth understanding of these fine-tuning techniques, such as instruction and alignment tuning methods. In particular, instruction tuning involves carefully constructing and selecting instruction datasets, employing strategic tuning methodologies, and considering practical implementation aspects. These strategies will significantly improve the performance, generalization, and user alignment of LLM technologies in the telecom domain. On the other hand, alignment tuning is a multifaceted process involving the setting of ethical guidelines, collection of human feedback, and application of advanced fine-tuning techniques such as RLHF. However, adapting these state-of-the-art fine-tuning techniques to telecom environments is still an open question. The fine-tuning process is usually task-specific, which requires professional knowledge of various telecom domain tasks. Instruction tuning can be a promising method for building a telecom LLM by using existing telecom knowledge, but the dataset collection can be difficult due to privacy issues.

Prompting techniques are especially useful for solving real-time telecom tasks with stringent delay requirements, e.g., resource allocation and user association. It means that LLMs can directly learn from the inputs and generate desired outputs without extra training, avoiding the tedious model training process in conventional ML algorithms. For instance, ICL

for eliciting deeper reasoning capabilities in LLMs, applicable to a range of complex reasoning tasks. While still evolving, this approach opens new avenues for LLM application across diverse problem domains such as the telecom field. In addition, prompt-based planning represents a sophisticated approach to navigating complex tasks, enhancing LLM's problem-solving capabilities through structured action sequences, feedback integration, and continuous plan refinement. Such planning capabilities are very important for telecom applications since many telecom tasks involve multi-step thinking with complicated procedures. For instance, the resource allocation may include multi-layer controllers [4], and optimization problems can involve several agents and elements [114]. Therefore, multi-step planning and thinking should be carefully designed for LLM-enabled telecom applications.

Evaluation metrics are critical to assess the LLM's performance in telecom environments. For instance, efficiency is one of the most important metrics that should be considered in telecom applications since many tasks require rapid or even real-time responses. Therefore, LLMs with long inference times may be inappropriate for these mission-critical applications, e.g., Ultra-Reliable Low Latency Communications (URLLC). In addition, evaluating the performance of LLMs should also include their proneness to hallucination and ethical standards, e.g., an LLM may make misleading or even wrong decisions in network management. As LLM designs and models continue to evolve and integrate more deeply into various aspects of society, the criteria for their evaluation will likely expand and become more sophisticated. Ensuring that LLMs are accurate, reliable, efficient, and ethically responsible is essential for their sustainable and beneficial integration into human-centric applications.

Finally, practical deployment is the prerequisite for applying LLMs to telecom networks. Compared with other domains such as education or healthcare, many telecom tasks have stringent requirements for delay and reliability, which require more efficient and reliable model outputs. Meanwhile, telecom devices usually have limited computational and storage resources. Therefore, efficient model training, fine-tuning, inference, and storage techniques should be explored [16]. With the previous knowledge and analyses, we will present detailed LLM-inspired techniques and applications in telecom tasks in terms of generation, classification, optimization, and prediction problems in the following sections.

IV. LLM FOR GENERATION PROBLEMS IN WIRELESS NETWORKS

The outstanding generation capability is one of the most
mains without explicit retraining, with its effectiveness heavily generation tasks, and then it presents detailed application
influenced by the design and structure of demonstrations. scenarios, including telecom domain knowledge generation,
Meanwhile, CoT prompting has emerged as a potent method code generation, and network configuration generation.

12
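The ICL-style prompting discussed above, i.e., supplying demonstrations directly in the prompt instead of retraining the model, can be sketched in a few lines. The task wording, field names, and demonstration data below are hypothetical illustrations, not an implementation from any cited study.

```python
# Minimal sketch of few-shot in-context learning (ICL) prompt construction
# for a telecom task. All task text and demonstration data are invented
# for illustration only.

def build_icl_prompt(task, demonstrations, query):
    """Assemble a few-shot prompt: task description, demonstrations, query."""
    lines = [f"Task: {task}", ""]
    for demo in demonstrations:
        lines += [f"Input: {demo['input']}", f"Output: {demo['output']}", ""]
    lines += [f"Input: {query}", "Output:"]
    return "\n".join(lines)

demos = [
    {"input": "SNR (dB) to BS1/BS2 = 12/7", "output": "associate with BS1"},
    {"input": "SNR (dB) to BS1/BS2 = 3/9",  "output": "associate with BS2"},
]
prompt = build_icl_prompt(
    "Associate each user with the base station offering the higher SNR.",
    demos,
    "SNR (dB) to BS1/BS2 = 5/11",
)
# `prompt` is then sent to the LLM as-is; no gradient update is involved.
```

The same template extends to CoT prompting by appending intermediate reasoning steps to each demonstration's output.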
TABLE IV
SUMMARY OF LLM-AIDED GENERATION-RELATED STUDIES IN THE NETWORK FIELD.

Topic: Domain knowledge generation
- [115] Scheme: Adapting a BERT-like model to the telecom domain and testing the model performance on a question answering downstream task in the target domain. Key findings: the proposed technique achieved an F1 score of 61.20 and an EM score of 36.48 on a small-scale telecom question answering dataset.
- [116] Scheme: A multi-stage BERT-based approach to understand the textual data of telecom trouble reports and then generate a ranked solution list for new troubles based on previously solved troubleshooting tickets. Key findings: 1) presenting more information in the query can produce a better list of recommended solutions; 2) creating a small candidate list is the key to reducing the model latency.
- [23] Scheme: Combining a BERT-like method with transfer learning for trouble report retrieval, leveraging non-task-specific telecom data and generalizing the model to unseen scenarios. Key findings: the experiment includes nearly 18,500 trouble reports, showing that combining pre-trained telecom-specific language models with fine-tuning strategies outperforms pure domain adaptation fine-tuning.
- [117] Scheme: Question answering tests on various LLMs, e.g., GPT-3.5, GPT-4, and Bard, covering telecom knowledge and product questions. Key findings: Bard and GPT-4 show promise with respect to accuracy and could be useful for telecom-domain question answering; LLM summarization requires reliability tests.
- [118] Scheme: Integrating domain-specific grammars into LLMs to guide the generation of structured language outputs, enhancing performance in domain-specific tasks. Key findings: demonstrates the efficacy of integrating domain-specific grammars with LLMs in enhancing their ability to generate structured language outputs tailored to specific domains, and emphasizes the potential of this approach to significantly improve LLM performance in domain-specific tasks.
Telecom application opportunities: techniques such as customizing LLMs to understand and apply telecom-specific language, and evaluating their genuine understanding of domain knowledge, can contribute to more efficient, reliable, and secure telecom service applications. Existing studies have demonstrated the capability of LLM techniques in telecom, including question answering, literature review, and generating troubleshooting reports, showing great promise for building next-generation communication networks.

Topic: Code generation
- [14] Scheme: Using LLMs to generate Verilog code for wireless communication system development on FPGA; the experiment was implemented in the OpenWiFi project. Key findings: the LLM is capable of refactoring, reusing, and validating existing code; with proper design and prompting, LLMs can generate more complicated projects with multi-step scheduling; the LLM greatly reduced the coding time of undergraduate and graduate students by 65.16% and 68.44%, respectively.
- [119] Scheme: A framework that uses LLMs to generate task-specific code for traffic analyses and network life-cycle management. Key findings: combining the LLM with proper libraries, such as GPT-4 and NetworkX, can achieve 88% and 78% coding accuracy for traffic analysis and network life-cycle management tasks, respectively.
- [18] Scheme: Employing four students to reproduce the results of existing network studies with the assistance of LLMs. Key findings: the students successfully reproduced networking systems by prompt engineering ChatGPT, with far fewer lines of code, e.g., one reproduction is only 20% of the size of the existing open-source version.
- [120] Scheme: Using LLM techniques for automated program repair of introductory-level Python projects. Key findings: the proposed scheme successfully repaired a larger fraction of programs (86.71%) than the baseline (67.13%), and adding few-shot examples raises the ratio to 96.50%.
- [121] Scheme: Fine-tuning pre-trained LLMs on Verilog datasets collected from GitHub and Verilog textbooks, and then generating Verilog projects. Key findings: fine-tuning LLMs on a specific language can improve the coding correctness rate by 26%.
Telecom application opportunities: code is the cornerstone of modern communication networks, and LLMs provide promising opportunities to improve the efficiency and reliability of code while greatly saving human effort. a) LLMs can refactor and validate existing code, which is very useful in the telecom field, since network architectures are constantly evolving and updated; b) with proper prompting, LLMs can generate complicated projects with multi-step scheduling requirements, which are very common in the telecom field due to complicated network elements with diverse functions.

Topic: Network configuration generation
- [122] Scheme: A three-stage LLM-aided progressive policy generation pipeline for intent decomposition. Key findings: evaluating a service chain use case, the paper found that LLMs could generalize to new intents through few-shot learning and concluded that leveraging LLMs for policy generation is promising for automatic intent-based application management.
- [123] Scheme: A multi-stage framework that utilizes LLMs to automate network configuration by taking in natural language requirements and translating them into formal specifications, high-level configurations, and low-level device configurations. Key findings: state-of-the-art LLM technologies like GPT-4 are capable of generating fully working configurations from natural language requirements without any fine-tuning.
- [124] Scheme: A framework that combines LLMs with verifiers, using localized feedback from the verifiers to automatically correct errors in configurations generated by the LLM. Key findings: the proposed scheme is able to synthesize reasonable though imperfect configurations with significantly reduced human effort; coupling LLMs with verifiers that provide localized feedback is necessary for real-world configurations, despite requiring more testing.
Telecom application opportunities: telecom network operators can leverage LLMs for network configuration generation in various ways, including automatic network provisioning, optimization and performance tuning, security and compliance configuration, fault diagnosis and troubleshooting, and network virtualization. LLMs enable efficient, reliable, and secure generation of network configurations, reducing manual effort and improving network management in telecom environments.

A. Motivations of Using LLM-based Generation for Telecom

This subsection will introduce the key motivations of using LLM-enabled generation for telecom applications. Firstly, LLMs can make telecom knowledge more accessible. LLMs have been pre-trained on many real-world datasets and are equipped with considerable knowledge from various fields. Therefore, question-answering has become the most well-known application of LLMs. With domain-specific datasets from websites and textbooks, the LLM can extract professional knowledge from existing publications and then generate appropriate answers based on users' requests. For instance, Maatouk et al. built a telecom knowledge dataset in [125], including 25,000 pages from research publications, overviews, and standards specifications. With proper training and fine-tuning, such a dataset can greatly contribute to a Telecom-GPT, providing a systematic overview of hundreds of publications and standards. With reasoning and comprehension capabilities, professional telecom knowledge will become much more accessible to all researchers and even benefit the whole society.

Meanwhile, LLMs' generation capabilities can also automate many tasks that are usually time-consuming. For instance, developing new standard specifications usually requires considerable writing, discussion, and review. By contrast, given enough historical reports and proper prompts, the LLM can produce a draft standard instantly, and the experts can then review it accordingly. Moreover, the experts' comments can be fed directly to LLMs, and the LLM can then produce a new version efficiently, significantly saving human effort on writing and revising paperwork. Similarly, LLM technologies have been used to generate code in many existing studies, which is one of the most time-consuming tasks of modern industry [18], [120]. LLMs can refactor and improve existing code, contributing to developing telecom projects.

In addition, LLMs can easily learn from provided examples, which is known as ICL. This capability is particularly useful in generation tasks, and LLMs can quickly generalize the given examples to related unseen scenarios. Meanwhile, if the initially generated output cannot satisfy the requirements, users can also send feedback directly to the LLM input, and the LLM agent will then revise the generation accordingly. This user-friendly generation approach lowers the difficulty of applying LLM techniques to generation tasks in telecom, which usually require considerable professional knowledge and experience.

Given the above motivations and advantages, it is crucial to exploit LLMs' generation capabilities and apply them to telecom networks. Table IV summarizes LLM-aided generation-related studies and telecom application opportunities. In the following, we will introduce domain knowledge generation, code generation, and network configuration generation.

B. Domain Knowledge Generation

Generating domain-specific knowledge is an important application of LLM technologies in telecom. In particular, it refers to creating comprehensive summaries, overviews, and interpretations of telecom standards, technologies, and research findings. By leveraging vast datasets of technical documents, research papers, and standards specifications, LLM agents can produce detailed explanations and summaries that are tailored to the user's level of expertise and interest. This not only democratizes access to telecom knowledge but also serves as a bridge to fill the gap between expert and non-expert users in the telecom field.

1) Understanding telecom domain knowledge: Telecom is a broad field, and there are various domains of knowledge such as signal transmission, network architectures, communication protocols, and industry standards. For instance, signal transmission is fundamental telecom knowledge, involving the differences between amplitude, frequency, and phase modulation, as well as the distinctions between digital and analog signals. Meanwhile, communication protocols refer to sets of rules that ensure standardized data transmission, allowing for interoperability among diverse systems. Knowledge of these protocols is fundamental for the development and maintenance of robust communication networks. Additionally, telecom standards are equally important. Standards such as 3G, 4G, and the emerging 5G for mobile communications, as well as IEEE 802.11 for Wi-Fi, play a critical role in global telecom networks [126]. They facilitate the seamless operation of devices and services across different networks.

A thorough understanding of the above telecom knowledge is not only vital for the development of new technologies and services, but also for ensuring that systems are interoperable and secure. The depth of understanding of telecom knowledge directly impacts the ability to innovate, secure, and solve problems within the telecom field. The integration of LLMs, trained with domain-specific datasets, offers promising avenues for automating knowledge generation and facilitating access to complex telecom content, thus bridging the gap between experts and general users.

2) Training LLMs with telecom-specific data: Training LLMs with telecom-specific data involves curating and preprocessing vast amounts of domain-specific information to fine-tune the models, aiming to generate accurate and relevant content within the telecom field. This process is crucial, as it tailors the LLM's capabilities to understand and generate content that aligns with specific telecom requirements. It can be summarized in the following steps:

• The first step in training the LLM with telecom-specific data is the collection of datasets. These datasets may include technical documents, research papers, standards specifications, and other forms of professional literature prevalent in the telecom sector. For example, Holm et al. [115] created a small-scale TeleQuAD to train the question-answering capabilities of the built BERT-based model. Similarly, 185,000 trouble reports [23] are included to train a BERT-like model to generate automated troubleshooting tickets. However, these datasets are usually inaccessible to the public. By contrast, Maatouk et al. [125] introduced a large dataset of telecom knowledge to provide systematic overviews and detailed explanations of standards and research findings.

Fig. 5. Using language models for automated troubleshooting in the telecom field [23].
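The retrieval-and-ranking flow summarized in Fig. 5 can be sketched as follows. This is a deliberately minimal, hypothetical version: the fine-tuned BERT encoders of [116], [23] are replaced by a toy bag-of-words similarity, and the ticket data is invented for illustration.

```python
# Toy sketch of top-K solution ranking for a new trouble report.
# A real system would embed texts with a fine-tuned language model;
# here a bag-of-words cosine similarity stands in for that encoder.
from collections import Counter
from math import sqrt

def embed(text):
    """Toy bag-of-words vector; a stand-in for a BERT-based encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_solutions(trouble_report, solved_tickets, k=3):
    """Return the top-k candidate solutions for a new trouble report,
    ranked by similarity to previously solved tickets."""
    q = embed(trouble_report)
    scored = sorted(solved_tickets,
                    key=lambda t: cosine(q, embed(t["observation"])),
                    reverse=True)
    return [t["solution"] for t in scored[:k]]

tickets = [  # invented example tickets
    {"observation": "cell outage after power failure", "solution": "restart baseband unit"},
    {"observation": "high packet loss on backhaul link", "solution": "replace backhaul fiber"},
    {"observation": "antenna alarm after storm", "solution": "inspect antenna feeder"},
]
top = rank_solutions("packet loss spikes on the backhaul", tickets, k=2)
```

As in Fig. 5, the returned top-K list would then be sent back to engineers for verification before any repair action.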

• Following dataset collection, the preprocessing stage involves cleaning and organizing the data to make it suitable for training. This step may include removing irrelevant information, correcting errors, and converting the data into a format that is compatible with the ML model. The study [127] shows that preprocessing large-scale datasets for LLM training can improve the model's learning efficiency and output quality.

• Finally, it is worth noting that there are two main approaches to train LLMs: training the model from scratch or fine-tuning a general-domain LLM. In particular, training the model from scratch may produce better performance since the model can specialize in telecom language, but it is also time-consuming. On the other hand, the fine-tuning process adapts a pre-trained LLM to the telecom domain. This step involves training the model on the collected telecom-specific dataset, allowing it to adjust its parameters to better understand and generate telecom content. Fine-tuning enables the model to grasp the unique terminologies, concepts, and contexts of the telecom field, significantly enhancing its generation capabilities. Although fine-tuning a pre-trained LLM is much more efficient than training from scratch, the experiment in [115] proves that training the model on telecom-domain text from scratch can achieve better performance than fine-tuning a general-domain model.

The integration of telecom-specific data into LLM training is not just about enhancing the model's knowledge base; it is about equipping the LLM with the ability to understand the nuances and complexities of the telecom field. This tailored training approach ensures that the LLM can generate content that is not only informative but also practical and applicable to real-world telecom challenges.

3) Using LLMs for telecom knowledge-related generation tasks: After proper training or fine-tuning, using LLMs to generate telecom domain knowledge is a transformative approach that leverages the model's ability to process and synthesize vast amounts of information into coherent, accessible content tailored to the needs of various stakeholders in the telecom field. This capability extends from generating summaries of complex technical documents to answering specific queries with detailed explanations, thereby facilitating a deeper understanding of telecom technologies, standards, and practices. In the following, we present some existing applications of telecom knowledge-related generation tasks.

Telecom-domain question answering: Question answering is one of the most well-known applications of LLM technologies. Using LLMs to answer domain-specific questions is grounded in the model's ability to interpret and articulate complex information in a manner that is both comprehensive and understandable. For example, Soman et al. evaluated the capabilities and limitations of existing pre-trained general-domain LLMs in [117], including GPT-3.5, GPT-4, Bard, and LLaMA. For instance, one telecom-domain question is "What are the different 5G spectrum layers?" GPT-4 identifies the bands as below 1 GHz, 1-6 GHz, and above 6 GHz, while LLaMA identifies the frequency bands as below 600 MHz, 600 MHz-24 GHz, and above 24 GHz. These differences could be caused by the different data sources of GPT-4 and LLaMA in the pre-training period. However, this could easily confuse or even mislead users without professional knowledge, which shows the importance of training a telecom-domain LLM specifically. Holm et al. [115] further investigate how various training methods can affect the model performance, e.g., pre-training a model using telecom knowledge from scratch or fine-tuning an existing general-domain model. In summary, LLM-enabled

question answering democratizes access to advanced telecom knowledge, making it accessible to a broader audience, including researchers, practitioners, and the general public. In addition, LLM agents can also tailor the generated content based on the user's level of expertise and specific interests. There is an increasing number of commercial LLM products for generative question answering over business documents, e.g., nexocode and Caryon. By leveraging the comprehensive understanding and generation capabilities of LLM technologies, the telecom industry can enhance the accessibility of complex information, support educational endeavours, and streamline development processes.

Generating troubleshooting solutions for telecom trouble reports: Telecom networks are complicated large-scale systems, and it is critical to identify, analyze, and then resolve both software and hardware faults, which are known as trouble reports. The authors in [116] and [23] investigated using language models to understand previous trouble reports and then generate recommended solutions. Grimalt applied a BERT-based model to generate and rank multiple possible solutions for a given system fault in [116], which achieves a nearly 55% correct rate. Then, Bosch [23] improved the model in [116] by including transfer learning and non-task-specific telecom data to improve the generalization capabilities in handling unseen trouble reports. Fig. 5 summarizes the proposed schemes in [116] and [23]. One can observe that the analysis and correction phases can be time- and effort-consuming, as they usually require professional knowledge of telecom networks and devices. To this end, a language model-enabled method is proposed. It takes trouble report observations, headings, and fault areas as input and generates the top-K possible solutions. Then, the generated candidate solutions are sent back for verification. In particular, the fine-tuning process of the language model consists of three main steps, involving the telecom language dataset, the MS MARCO document ranking dataset [128], and the trouble report dataset. Here, the MS MARCO dataset is included to train question-answering and ranking models, in which a large number of question-answer pairs are collected from search engines [128]. Fig. 5 shows that using language models to generate solutions for automated troubleshooting can significantly improve the overall efficiency, enabling faster response and repair for telecom.

Finally, it is worth noting that these models may generate misleading or even wrong solutions, which can be caused by different data sources, training strategies, and so on [117]. For instance, the best correct rate in [23] is around 60%, and therefore, verification is crucial before real-world implementation.

C. Code Generation

Efficient and reliable code is of paramount importance to intelligent communication networks. Recent studies have demonstrated the strong coding capability of LLMs, including commonly used languages (e.g., Python [120], [129]) and hardware description languages (e.g., Verilog [14], [121]). For instance, Zhang et al. [120] apply the LLM to build an automatic program repair system for introductory Python programming assignments, and the experiment on 286 real student programs achieves a repair rate of 86.71%. For hardware description languages like Verilog for FPGA development, Du et al. [14] show that the LLM can reduce nearly 50% of the coding time for undergraduate and postgraduate students and improve the quality by 44.22% for undergraduates and 28.38% for postgraduates. Existing studies [14], [120], [121], [129] have shown that LLMs can refactor and improve existing code. In addition, well-crafted prompts and designs can tackle complex, multi-step coding challenges encompassing multiple sub-tasks. Given these potentials, introducing LLM-aided coding into telecom can greatly save human effort in coding, validating, and debugging, while providing more efficient and reliable code for telecom network scheduling and management projects.

1) LLM for code refactoring: Code refactoring is a common task that is frequently involved when developing wireless communication systems. Code refactoring aims to improve the readability, efficiency, and reliability of existing code [130]. For instance, good readability can lower the difficulty of long-term maintenance and reuse of existing code modules. Readability is also a critical requirement for wireless networks since network architectures and protocols are constantly evolving and updated, e.g., from WiFi 6 to 6E and WiFi 7, and from RAN to cloud RAN and Open RAN. However, real-world projects usually include multiple contributors with different coding styles and mixed qualities. Such an issue could be very common in telecom networks, which are complicated large-scale systems that include multiple modules with diverse functions. Therefore, improving code readability, efficiency, and reliability becomes even more important for the telecom field. Fig. 6 shows an example from [14], which applies ChatGPT to revise the original code of an open-source FPGA-based project, OpenWiFi [131]. The pink fonts indicate the changes made by ChatGPT. In particular, ChatGPT suggests using meaningful names for modules and variables, e.g., replacing the name "DelayT" with "DelayBuffer". Meanwhile, four comments are added to improve the readability of the input and output. The input and output data type specification "wire" is added from line 2 to line 5, providing more explicit definitions and higher reliability. ChatGPT also recommends adding the negative edge of the active-low reset signal in the "always" block in line 9 of the revised code. Du et al. [14] explained that such an asynchronous reset is more reliable, and the system can make instant responses when detecting errors, without waiting for the rising edge of the clock signal.

In addition, code validation is also an important task for telecom project development. Du et al. [14] utilized ChatGPT to generate an error-free testbench for effective OpenWiFi project validation. However, the fine-tuning process was not investigated, which can be a prerequisite for effectively generating hardware description languages. Different from the aforementioned studies, Thakur et al. [121] fine-tuned a pre-trained LLM on Verilog datasets collected from GitHub and textbooks, demonstrating that fine-tuned LLMs can improve the coding correctness rate by 26% on a specific language.
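The refactor-and-validate workflow described in this subsection can be sketched as a simple feedback loop. Here, `query_llm` is a placeholder for any chat-completion client, and both the prompt wording and the toy `validate` check are assumptions for illustration; a real pipeline would run a linter, compiler, or testbench instead.

```python
# Sketch of an LLM-aided refactoring loop with error feedback, following
# the practice of resending error messages to the LLM. `query_llm` is a
# placeholder callable; the refactoring instructions and the validation
# check are illustrative assumptions, not from any cited implementation.

REFACTOR_INSTRUCTIONS = (
    "Refactor the following Verilog module: use meaningful module and "
    "variable names, comment every input and output, declare explicit "
    "wire types, and use an asynchronous active-low reset."
)

def validate(code):
    """Toy stand-in for a lint/compile/testbench step.
    Returns an error message, or None if the code passes."""
    return None if "module" in code else "missing module declaration"

def refactor(source_code, query_llm, max_rounds=3):
    """Ask the LLM to refactor code, feeding back validation errors."""
    prompt = f"{REFACTOR_INSTRUCTIONS}\n\n{source_code}"
    revised = source_code
    for _ in range(max_rounds):
        revised = query_llm(prompt)
        error = validate(revised)
        if error is None:
            return revised
        # As in the debugging practice above: send the error message back.
        prompt = f"Fix this error in the code below: {error}\n\n{revised}"
    return revised
```

The loop returns as soon as the revised code passes validation, bounding the number of LLM calls with `max_rounds`.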

Fig. 6. Using ChatGPT to improve the code quality of the OpenWiFi project [14] (the pink fonts show the main changes).

2) LLM-aided code generation with multi-step scheduling: Previous sections have shown that LLMs can be used for fundamental coding tasks. However, real-world telecom project development is usually much more complicated, including multi-step scheduling and several sub-tasks. Xiang et al. applied LLMs to regenerate the code of existing studies in [18], and the authors suggested that ChatGPT does not respond well to monolithic prompts like "implement this technique in the following steps". Instead, a more practical method is to send a detailed modular prompt each time. Such a step-by-step approach is also investigated in [119] and [14]. Specifically, Mani et al. [119] applied LLMs to network graph manipulation, and the prompt design is decoupled into an application prompt and a code generation prompt. The application prompt provides task-specific prompts based on templates and user queries, and then the code generation prompt uses plugins and libraries to instruct LLMs. The experiment shows that combining the LLM with proper libraries, such as GPT-4 and NetworkX, can achieve 88% and 78% coding accuracy for traffic analysis and network lifecycle management tasks, respectively. Du et al. investigated a more complicated coding task in [14] by using Verilog to build a Fast Fourier Transform (FFT) module. A failure is first observed by using the following prompt:

A failed prompt in [14] to generate the FFT module:
"Help me write an FFT module for my FPGA system in Verilog language. Here are details of my specifications: ... I also provide you with the instantiation template: ..."

The generated code failed because: a) FFT computation is a complicated task with several sequential or parallel subtasks; b) the LLM lacks the capability of multi-step scheduling. To this end, the authors decouple the problem into four steps:

• Step 1: Asking ChatGPT to generate two simple IP cores that are frequently used in the following FFT design:
"I am working on an FPGA project in Verilog. Please write two IP cores for me. The first IP core is for butterfly computation for FFT. Here is its template: ... The second IP core is for complex multiplication in FFT. I will use it to multiply the output of a butterfly computation with the twiddle factor provided... Here is a template of the IP core..."

• Step 2: Showing ChatGPT a simple 2-point FFT example with templates and suggestions and then asking ChatGPT to produce a 4-point FFT IP core:
"I am writing a four-point DIF-FFT on FPGA. You can use the following IP cores to build the target four-point FFT IP core. Here is the template of the butterfly computation IP Core..." "And here is the template of the two-point FFT IP Core..." "Further, I also have some suggestions for you..."

• Step 3: Asking ChatGPT to develop an eight-point FFT module based on the 4-point FFT generated in Step 2:

"I am writing an eight-point DIF-FFT on FPGA. Apart from the IP cores given in Question One, ..., you can also use the fft_4_point IP core generated in Answer One. You need to look back to Question-1 and Answer-1 for detailed input/output information on the four IP cores. Once again, I want to emphasize that: ..."

• Step 4: Finally, asking ChatGPT to generate a 16-point FFT using the 8-point FFT generated in Step 3. This step is repeated in [14] by asking for a 2N-point FFT module based on previously generated N-point FFT modules.

Steps 1-4 form an obvious step-by-step CoT approach. Instead of asking for an 8-point FFT module directly, it starts from two simple IP cores and then provides examples of 2-point FFT modules with detailed suggestions. This is a very useful technique for LLM-aided project design in telecom networks, decoupling the objective into several steps with detailed examples and suggestions.

Finally, we summarize some key lessons from existing studies on the use of LLMs for code generation. Firstly, step-by-step prompt design is an important lesson that has been demonstrated in several existing studies [14], [18], [119]. Decoupling a complicated multi-step scheduling problem into several stages will lower the difficulty for the LLM's understanding. For instance, in a 5G cloud RAN simulation, we can divide the network into cloud, edge, and users, and then use the LLM to generate the code for each part sequentially. Secondly, examples and pseudo-code are important for code generation. The LLM has excellent ICL capabilities, quickly learning from examples and generalizing to other scenarios. Xiang et al. [18] also reveal that implementing with pseudocode first can produce stabilized data types and structures, avoiding further changes when implementing the following components. There is a large amount of telecom-related code in GitHub repositories and textbooks, and taking advantage of these existing examples is crucial when using LLM techniques. Then, a significant amount of human effort can be saved in code generation by using LLMs for debugging and testing. Xiang et al. [18] also show that most errors can be solved by sending the error message to the LLM. Many of these errors are related to data types, which can be avoided by specifying key variables' data types. This lesson is also proved in [14], in which the LLM specified the data types of inputs and outputs to improve the reliability of existing code. Finally, LLM-aided coding can lower the requirement for professional knowledge [14], [18].

D. Network Configuration Generation

Network administrators orchestrate the flow of information within a network. They can guide data from source to destination by configuring a complex set of parameters for network elements. These configurations impact a wide range of devices and services, such as switches, routers, servers, and network interfaces. To ensure a reliable data stream, these settings require precise calibration across all network functionalities. Over the past ten years, both academic institutions and the commercial sector have embraced the concept of Software-Defined Networking (SDN) [132] as a means to streamline network management, marking a shift away from the older, more rigid networking models. SDN offers numerous advantages; nonetheless, adjusting network settings remains a task that often requires manual input. Such manual adjustments can be expensive, as they demand the skills of specialized developers familiar with various network protocols, and meanwhile such manual configurations are also intricate and prone to errors. Numerous initiatives have been launched with the aim of streamlining the translation of overarching network guidelines into individual settings for each network component. Such efforts focus on reducing human errors by creating verifiable and reliable configuration outputs through rigorous checks [133], [134]. Nonetheless, setting up network configurations is still considered a labour-intensive, intricate, and costly endeavour for network operators.

Recent advancements have demonstrated that LLMs possess the ability to generate cohesive and contextually relevant content. They can answer questions and sustain in-depth conversations with users. Applications like GitHub Copilot and Amazon CodeWhisperer exemplify these advancements, assisting with a variety of programming-related tasks. These developments inspire confidence that LLMs can also be utilized to generate network configurations [24], [122].

One notable development of LLM-aided network configuration is CloudEval-YAML [24], a benchmark that provides a realistic and scalable assessment framework specifically for YAML configurations in cloud-native applications. This benchmark utilizes a hand-crafted dataset and an efficient evaluation platform to thoroughly examine the performance of LLMs within this context. Dzeparoska et al. [122] have introduced a pioneering method that employs the few-shot learning capabilities of the LLM to automate the translation of high-level user intents into executable policies. This approach facilitates dynamic, automated management of applications without the necessity for predefined procedural steps. In a related vein, Wang et al. [123] have developed NETBUDDY, a multi-stage pipeline that leverages LLMs to translate high-level network policies specified in natural language into low-level device configurations. NETBUDDY first uses an LLM
In particular, Du et al. [14] show that both undergraduate and to convert the input into a formal specification, such as a data
graduate students can benefit from the assistance of LLMs, structure to express reachability. It then generates forwarding
achieving comparable coding qualities. Xiang et al. [18] prove information and configuration scripts from the formal specifi-
that undergraduate students can reproduce the results of some cation. Finally, NETBUDDY interacts with an LLM multiple
existing network studies by using the LLM. times to sequentially provide topology, addressing details and

18
Fig. 7. Frameworks for LLM-based network configuration generation.

prototype programs to automatically generate vendor-agnostic These existing studies have demonstrated the potential of
configurations for the switches and routers. The evaluation of using the LLM to configure networks automatically, which
the network emulator demonstrates NETBUDDY’s ability to can be very useful in configuring telecom network settings.
enforce path policies and dynamically modify existing deploy- The LLM offers promising opportunities for the automation of
ments. In addition, Mondal et al. [124] presented Verified tedious tasks, reduction of human error and cost, and rapid pro-
Prompt Programming (VPP) to improve GPT-4’s ability to totyping and deployment of network infrastructure. However,
synthesize router configurations. VPP combines GPT-4 with telecom networks are complex systems with numerous inter-
verifiers like Batfish [135], which check configurations for dependent components, and there are still many challenges to
syntax errors and semantic differences. Experiments showed applying LLM technologies to telecom network configuration,
that VPP presented 10× leverage performance for translating a e.g., contextual understanding, error handling and verification,
Cisco configuration to Juniper format by identifying and fixing security concerns, and interoperability between vendors and
syntax errors, structural mismatches, attribute differences, and devices. For example, networks often comprise devices from
policy behaviours through 20 automated prompts. Implement- various vendors, each with its own configuration language
ing no-transit policies across 6 routers achieved 6× leverage and parameters. The LLM must be capable of understanding
performance with 12 automated prompts guiding GPT-4 to and generating configurations that are compatible across these
resolve syntax, topology, and semantic policy errors. diverse environments. In addition, network configurations must
adhere to security best practices. The LLM must be equipped
Fig. 7 summarizes three frameworks for LLM-based net- to understand and apply these practices consistently to avoid
work configuration generation. In particular, the first frame- creating security vulnerabilities.
work employs a simplistic design, directly utilizing LLM
to generate network configurations from natural language. E. Discussions and Analyses
However, the generated configurations may be inaccurate and LLM techniques have promising generation capabilities for
require human inspection and improvement. In the second telecom applications, and Sections IV-B to IV-D have intro-
framework [123], a hierarchical design is employed, where duced various scenarios for generating telecom knowledge,
multiple LLMs collaborate to generate low-level network troubleshooting reports, code, and network configuration. Ta-
configurations step-by-step, aiming to enhance the final output. ble V summarized the main features, input and fine-tuning
The verification scheme is crucial to evaluate the quality requirements, advantages, and telecom applications. In the
of the produced configuration, which may be placed in the following, we summarize the key findings and analyses.
second design as in [124] and [123] to check the syntax, Firstly, multi-step planning capabilities are crucial for
compilability, and correctness of the generated output. The telecom-related generation tasks. Telecom networks are large-
third framework [124] is an automated design, incorporating scale complicated systems, and many tasks require dedicated
an automatic verifier once the configuration is generated. This planning and scheduling. For example, the study in [14]
verifier validates the configuration and allows the LLM to demonstrated that using a one-step prompt to generate a
automatically refine the output. While human inspection is complicated 64-point FFT module is impractical, while step-
still necessary, this approach significantly reduces the extent by-step planning can achieve a satisfactory result. Similarly,
of manual intervention required. It is worth noting that these step-by-step reasoning and planning are also useful to repro-
frameworks are not mutually exclusive and can be combined. duce the results of existing publications for code generation
For instance, in the hierarchical design, an automatic verifier problems [18]. Therefore, multi-step planning, i.e., step-by-
can be added after each LLM iteration. step prompt design, is critical to obtain the desired output.
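The automated, verifier-in-the-loop design discussed above can be reduced to a short control loop. The sketch below is illustrative only: `generate` and `verify` are hypothetical stubs standing in for an LLM call and a configuration verifier (e.g., a syntax checker in the spirit of Batfish), not an implementation of NETBUDDY or VPP.

```python
def generate_with_verifier(generate, verify, max_rounds=3):
    """Automated framework: generate a configuration, run it through a
    verifier, and feed the error report back for refinement until the
    verifier passes or the round budget is exhausted.

    generate(feedback) -> config string (feedback is None on round 1)
    verify(config) -> (ok: bool, report: str)
    Returns (config, ok, rounds_used).
    """
    feedback = None
    config = ""
    for round_no in range(1, max_rounds + 1):
        config = generate(feedback)
        ok, report = verify(config)
        if ok:
            return config, True, round_no
        feedback = report  # the error report becomes the next prompt context
    return config, False, max_rounds


# Stub "LLM": emits a broken config first, then a fixed one after feedback.
def stub_generate(feedback):
    return "vlan 10\n" if feedback else "vlan ten\n"

def stub_verify(config):
    ok = config.split()[1].isdigit()
    return ok, "" if ok else "syntax error: VLAN id must be numeric"

config, ok, rounds = generate_with_verifier(stub_generate, stub_verify)
```

In practice, `verify` could wrap an emulator or a vendor-specific linter, and the returned report would be appended to the next prompt, mirroring the third framework in Fig. 7.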

TABLE V
SUMMARY OF LLM-BASED GENERATION FOR TELECOM.

Domain knowledge generation: Telecom-domain question answering
- Main features: Question answering is the most well-known application of LLMs. It represents a significant step forward in the ongoing effort to bridge the knowledge gap in telecom and empower individuals and organizations within the field, including telecom question answering, literature summary and review, etc.
- Prompt and fine-tuning requirements: General-domain LLMs can also answer telecom-related questions, but fine-tuning a telecom-specific LLM can provide more reliable and professional answers. CoT prompting may improve the answer quality.
- Advantages over conventional approaches: The use of LLM techniques signifies a shift towards more efficient and accessible knowledge dissemination than existing textbooks, websites, and tutorials, owing to the models' comprehension and reasoning capabilities.

Domain knowledge generation: Generating solutions based on trouble reports
- Main features: Using language models to generate troubleshooting solutions automatically, taking trouble observations and fault information as input and producing recommended solutions.
- Prompt and fine-tuning requirements: The language model must be fine-tuned on telecom-domain language and trouble report datasets. Extra document-ranking fine-tuning is required to realize recommendation functions.
- Advantages over conventional approaches: Automated troubleshooting is a very promising technique to greatly save human time and effort, since the conventional approach relies on expert knowledge and trial-and-error tests.
- Applications for telecom fields: 1) Building a telecom-domain LLM is a promising direction to make telecom knowledge more accessible for both professional researchers and the public. 2) Automated troubleshooting is another promising application to automate the problem-solving process in telecom fields. 3) The LLM also has the potential to generate other language-related artifacts, e.g., specifications and protocols.

Code generation: Code refactoring
- Main features: Using the LLM for fundamental code refactoring and design validations, improving the code quality automatically without human intervention.
- Prompt and fine-tuning requirements: The prompt input is easier since no multi-step scheduling is involved. Fine-tuning the LLM on existing code can improve the quality of the generated code.
- Advantages over conventional approaches: 1) Improving the readability, efficiency, and reliability of the project. 2) Considerably saving human effort on coding, debugging, and testing the project. 3) Lowering the requirement for professional knowledge when developing a system.

Code generation: Coding tasks with multi-step scheduling
- Main features: The LLM can also be used to generate complicated projects with multi-step scheduling and sub-tasks.
- Prompt and fine-tuning requirements: The input prompts have to be carefully designed in a CoT approach with appropriate examples and templates.
- Applications for telecom fields: Coding is one of the most time-consuming parts of wireless system development. Incorporating LLM-aided coding can greatly save human effort and improve the code quality. However, datasets may be required for fine-tuning, which can be collected from GitHub or wireless textbooks.

Network configuration generation: Automatic network configuration generation using LLMs
- Main features: Using the LLM to generate network configurations automatically, which are then verified by LLMs or humans.
- Prompt and fine-tuning requirements: The prompt must be carefully designed due to the complexity of network configurations, e.g., by dividing the prompts into a task-specific part and a code generation part.
- Advantages over conventional approaches: The LLM enables efficient generation of network configurations, reducing manual effort and cost in the telecom industry.
- Applications for telecom fields: Applications include automatic network provisioning, performance tuning, security and compliance configuration, etc.
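The multi-step prompt requirements above can be condensed into a small helper that builds one prompt per stage. Everything concrete in this sketch is hypothetical, including the stage names, example snippets, and type hints for a 5G cloud-RAN task; it only illustrates the decoupling and data-type-pinning lessons discussed in this section.

```python
def build_staged_prompts(task, stages, type_hints):
    """Decouple a multi-step coding task into one prompt per stage.

    Each prompt carries: the overall task, the current stage, an
    illustrative example for in-context learning, and explicit data
    types for key variables (many LLM code-generation errors are
    type-related, so types are pinned in the prompt itself).
    """
    prompts = []
    for i, (name, example) in enumerate(stages, start=1):
        lines = [
            f"Overall task: {task}",
            f"Step {i} of {len(stages)}: implement the {name} component only.",
            f"Example for this step:\n{example}",
            "Use these data types for key variables: "
            + ", ".join(f"{var}: {typ}" for var, typ in type_hints.items()),
        ]
        prompts.append("\n".join(lines))
    return prompts


# Hypothetical decomposition of a 5G cloud-RAN simulation task.
stages = [
    ("cloud server", "class Cloud: ..."),
    ("edge server", "class Edge: ..."),
    ("user equipment", "class User: ..."),
]
prompts = build_staged_prompts(
    "simulate a 5G cloud RAN",
    stages,
    {"traffic_load": "float", "user_id": "int"},
)
```

Each prompt would then be sent to the model in sequence, with the previous stage's output available as context for the next.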

For instance, the prompt design in [119] is decoupled into an application prompt and a code generation prompt, in which the application prompt focuses on task-specific requirements, and the code generation prompt uses plugins and libraries to instruct LLMs.

Meanwhile, LLM-enabled generation can significantly save human effort. Existing studies have shown that LLMs have excellent capabilities for code refactoring and validation, which are usually handled manually with considerable human effort. Applying such techniques to the telecom field will significantly save human labour on project coding, validating, and debugging. For instance, Zhang et al. [120] report that the LLM can successfully repair 86.71% of programs for introductory-level Python projects, and adding few-shot examples raises the ratio to 96.50%. In addition, LLMs can also abstract fundamental knowledge in the network field from textbooks, journals, and specification standards, which avoids the time-consuming literature review process.

LLMs have been trained on many real-world datasets from web pages like Wikipedia, and they can be further fine-tuned on domain-specific datasets, e.g., telecom [125] and cybersecurity [136]. Despite the satisfactory performance, there is no guarantee of the correctness of the generated output. Such risks are avoided when the generated code or network configuration can be verified by pre-designed test cases. However, when using the LLM to summarize or extract knowledge from existing literature, the quality of the generated knowledge is hard to validate. For example, LLMs may produce wrong numbers or units, and these mistakes can easily mislead users without professional knowledge. To this end, efficient validation schemes are crucial to evaluate the performance of generated solutions, especially for coding and network configuration problems. Human verification is a simplistic and straightforward approach, but it requires considerable human labour and can be inefficient. Therefore, automatic validation is the key to improving the overall efficiency of the whole pipeline, e.g., sending the code implementation error message to an LLM for automatic debugging [18], and using LLMs to validate the network configuration files.
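A minimal sketch of such an automatic validation step is shown below: candidate code is executed against pre-designed test cases, and any error message is collected in the form one would feed back to an LLM for debugging. The candidate snippet and test cases are illustrative, not from the cited studies.

```python
import traceback

def validate_generated_code(code_str, test_cases):
    """Run LLM-generated code against pre-designed test cases and
    collect error messages for automatic debugging.

    test_cases: list of (expression, expected_value) pairs evaluated in
    the namespace defined by code_str. Returns (passed: bool, errors).
    On failure, `errors` holds the messages one would send back to the
    model, mirroring the feed-the-traceback-to-the-LLM workflow.
    """
    namespace = {}
    errors = []
    try:
        exec(code_str, namespace)  # compile and run the candidate code
    except Exception:
        return False, [traceback.format_exc()]
    for expr, expected in test_cases:
        try:
            result = eval(expr, namespace)
            if result != expected:
                errors.append(f"{expr} returned {result!r}, expected {expected!r}")
        except Exception:
            errors.append(traceback.format_exc())
    return not errors, errors


# Hypothetical LLM output with a type bug: '+' applied to str and int.
candidate = "def add_load(base, extra):\n    return base + extra\n"
passed, errors = validate_generated_code(
    candidate, [("add_load(1, 2)", 3), ("add_load('1', 2)", 3)]
)
```

Here the collected TypeError message is exactly the kind of feedback that, per the lessons above, can be sent back to the model to repair the code automatically.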

V. LLM-ENABLED CLASSIFICATION PROBLEMS

Classification problems are extensively studied within telecom networks. Accurate and robust classification is crucial for improving network service quality and performance. This section will introduce the motivations and capabilities of LLM technologies in addressing a range of classification problems, including attack classification and detection, text classification, image classification, and encrypted traffic classification.

A. Motivations and Classification Capabilities of LLMs

Conventional classification techniques heavily rely on statistical methods. However, with the recent advancements in telecom networks, there has been a surge in multi-modal and heterogeneous network data, e.g., numerical traffic data, textual security logs, and environmental images, which presents significant challenges for traditional classification techniques, indicating a need for more advanced and adaptable approaches. Recently, LLM techniques have shown their capability to effectively process multi-modal and heterogeneous data across both natural language and computer vision fields. These capabilities position them as a promising research direction for addressing classification problems within telecom networks.

Firstly, LLM technologies can contribute to telecom security by automated security language understanding and classification. Inspired by numerous advancements in NLP, LLMs such as GPT [149] and BERT [53] excel at explaining textual contents and transforming them into informative representations. With their strong capabilities, LLMs have recently shown exceptional superiority in attack detection, aiming to enhance the security of telecom networks [139]. Through fine-tuning pre-trained models or developing LLMs from scratch, LLMs retain their functional capabilities in general English while gaining a thorough understanding of the specialized security language, allowing them to effectively identify and respond to security threats in telecom networks.

Secondly, LLM techniques have inherent advantages in text-related classification tasks, which are very useful for text processing and classification in the telecom field, e.g., customer textual feedback, telecom standard specifications, technical reports and publications. For example, enhancing the quality-of-experience (QoE) of telecom services hinges on a comprehensive understanding of customer feedback, which may include various real-world topics, ranging from signal strength to sending messages and calls [150]. Given their superiority across various text-related tasks, LLMs have strong capabilities to classify customer comments and extract useful feedback, allowing telecom operators to enhance service quality by properly understanding user satisfaction levels.

In addition, LLMs can extract visual features from the dynamic and complex telecom environment. The integration of computer vision and image processing into the telecom field, such as equipping BSs with cameras to pinpoint user locations, can boost network efficiency in the dynamically changing wireless environment. Although primarily focusing on processing and understanding textual information, some LLMs also have remarkable image processing capabilities in vision-related tasks [151], including image-to-text generation [152], object detection [153], etc. Consequently, this integration enables the LLM to analyze both visual and network data, which can effectively bridge the gap between textual and visual data analysis, leading to a more comprehensive approach for network management.

Finally, LLMs' zero-shot classification capabilities have been demonstrated in multiple existing studies, such as text and image classification tasks [26], [144]. In particular, this means that the LLM can be used to classify and detect objects by using the real-world knowledge learned in the pre-training phase, and no extra training is required for target tasks. Such zero-shot classification capabilities can be appealing for telecom networks since many telecom classification tasks need rapid responses, e.g., network attack detection [139] and image processing and classification [25]. With the above potential and motivations, in the following, we will introduce LLM-enabled attack classification and detection, text classification, image classification, and encrypted traffic classification problems.

B. LLM for Telecom Security and Attack Detection

The numerous advancements in telecom have led to more complex and interconnected infrastructures with a wide range of technologies, protocols, and services, which can pose significant challenges in controlling and monitoring telecom security. The growing threats and incidences of hostile attacks have exposed severe vulnerabilities in telecom. For instance, Denial of Service (DoS) attacks can decrease network availability by overwhelming systems, and Man-in-the-Middle (MITM) attacks can violate network integrity by secretly modifying communications between two parties. This underscores the requirement for robust attack detection mechanisms to monitor the network system against malicious activities. However, with the evolution of current telecom networks, a surge of multi-modal network data can be captured, including numerical measurements such as traffic loads and CSI, and descriptive textual contents with device logs and network configurations. The data contains a substantial amount of redundant and correlated information, potentially obscuring critical patterns in attack detection, which poses significant challenges to achieving accurate attack detection.

Recently, NLP has achieved numerous successes in capturing informative features from multi-modal and heterogeneous data across various application scenarios, including sentiment analysis, speech recognition, and machine translation, among others. Specifically, LLM techniques have emerged as a promising direction across various NLP applications, which are beneficial for explaining textual inputs and transforming them into quantitative representations. The common method to apply LLMs across various domains involves employing general-domain models as baselines, followed by fine-tuning them for specific domain tasks. To enhance the security of telecom networks through LLM techniques, it is important to note that the security language, such as ransomware, API, OAuth, and keylogger, significantly differs in structure and

TABLE VI
SUMMARY OF LLM-AIDED CLASSIFICATION-RELATED STUDIES AND TELECOM APPLICATIONS.

Security-related classification:
- [136] Scheme: Building a specialized cybersecurity language model (SecureBERT) by fine-tuning RoBERTa [54] on a cybersecurity corpus. Key findings: SecureBERT excels at understanding text within a cybersecurity context, which enables a strong generalization capability across various telecom security tasks.
- [139] Scheme: Building a security-specific LLM from scratch designed for detecting network cyber threats, involving several steps: data preparation, data tokenization, model training, and model deployment. Key findings: SecurityBERT showcases the powerful predictive capabilities of security-specific LLMs in identifying various types of attacks, significantly outperforming traditional ML and DL models.
- [137] Scheme: Building a novel classifier of cybersecurity feature claims (CyBERT) by fine-tuning a pre-trained BERT language model [53] on industrial control device documents; a large repository is created to gather industrial device information encompassing 41073376 words. Key findings: CyBERT enables the effective identification of cybersecurity claim-related sequences, with an accuracy improvement of 19% in comparison to the general BERT text classifier [53].
- [138] Scheme: Applying transfer learning to a BERT model [53] to extract changeable token embeddings from vulnerability descriptions, with a pooling layer positioned on top to extract sentence-level semantic features. Key findings: The exploitability prediction framework (ExBERT) not only accurately predicts software vulnerabilities but also learns sentence-level semantic features and captures long dependencies within descriptions.
- [140] Scheme: Applying a BERT model [53] to tokenize URLs within HTTP requests and then passing these tokens to a multilayer perceptron model to distinguish normal and anomalous HTTP requests. Key findings: By integrating NLP with web attack detection, BERT [53] demonstrates strong capabilities in understanding web requests and SQL language, achieving remarkable detection performance that significantly surpasses that of traditional ML detection methods.
- Telecom application opportunities: By fine-tuning pre-trained general LLMs [136]-[138] or building security-specific models from scratch [139], [140], LLMs exhibit an advantage in understanding the security context, enabling the application of LLM techniques across a range of telecom security tasks. Existing studies show that LLM-based methods can outperform existing ML and DL models in terms of classification and detection accuracy. LLM techniques can also provide incident recovery suggestions. However, it is essential to initially create relevant training and testing datasets extracted from security-related telecom language corpora.

Text classification:
- [141] Scheme: Applying an AraBERT model to classify telecom customer satisfaction in Saudi Arabia using a Twitter dataset. Key findings: The BERT-based model obtained more accurate and stable results than conventional CNN and RNN algorithms.
- [13] Scheme: Fine-tuning several LLMs, e.g., BERT, distilled BERT, RoBERTa, and GPT-2, on telecom-domain language, and using them for 3GPP standard classification problems. Key findings: With proper pre-processing and fine-tuning, the experiment in [13] achieves 80% accuracy even when only 20% of the text segments are used.
- Telecom application opportunities: LLM techniques have inherent advantages in processing text-related tasks. Existing studies have shown that the LLM can achieve performance comparable to existing CNNs or RNNs. It is promising for text-related telecom tasks such as standard development and user feedback processing [142].

Image classification:
- [26] Scheme: Investigating the zero-shot image classification capabilities of the LLaVA model, i.e., using the model directly without any extra training. Key findings: The performance can be significantly improved with a combination of carefully crafted prompts, hierarchical classification strategies, and adjusted model temperatures.
- [144] Scheme: Using the LLM's inherent knowledge to generate descriptive sentences with crucial discriminating characteristics of the image categories. Key findings: This simple approach can effectively improve zero-shot image classification accuracy on a range of benchmarks.
- Telecom application opportunities: Images are important information for telecom sensing. Efficient image classification can be very useful for many telecom applications, including vision-aided sensing, mmWave beamforming [25], user localization [143], and so on.

Network traffic classification:
- [145] Scheme: Capturing long-distance contextual relations within the traffic sequence through BERT, and then integrating packet-level token semantic features at the forward and backward positions of a BiLSTM, which enhances the BiLSTM's attention to packet-level features. Key findings: The BiLSTM can capture relevant features of front and rear token sequences after BERT extracts general features of the encrypted traffic, learning the long-distance relations within token sequences.
- [146] Scheme: Building an Encrypted Traffic BERT (ET-BERT), which aims to learn generic traffic representations from large-scale unlabeled encrypted traffic. Key findings: ET-BERT showcases strong effectiveness and generalization across five encrypted traffic classification tasks, e.g., general encrypted application classification [147], encrypted malware classification, and encrypted traffic classification on VPN [148].
- Telecom application opportunities: LLMs facilitate effective encrypted traffic classification, a critical technique in telecom network management that also protects data and user privacy. Note that the assumption of clean pre-training data presents challenges in secure traffic classification: the vulnerability is exposed particularly when attackers craft a poisoned model with backdoors by inserting low-frequency words as toxic embeddings, which allows attackers to deceive the normally fine-tuned model during specific classification tasks.

semantics from the general linguistic language. This suggests that a conventional LLM may find it challenging to fully understand the specific vocabulary inherent to security-related texts, potentially leading to limited generalization ability in security applications. To this end, existing studies that employ LLM techniques for attack detection can be categorized into two primary directions as follows:

1) Fine-tuning pre-trained LLMs: Existing studies have leveraged pre-trained LLMs and adapted them to achieve specific security objectives through model fine-tuning [136].

Fig. 8. Framework of LLM-based attack detection [139].

For instance, Aghaei et al. [136] introduce a specialized cybersecurity language model named SecureBERT, which is capable of processing texts with cybersecurity implications and can be effectively applied across a broad range of cybersecurity tasks, including phishing detection, intrusion detection, code and malware analysis, etc. In particular, SecureBERT applies a cybersecurity corpus comprising 1.1 billion words, divided into 2.2 million documents, with each document averaging 512 words through the Spacy text analytic tool [154]. To build the security-customized tokenizer, a byte pair encoding method is employed to extract 50265 tokens from the cybersecurity corpus to generate the initial token vocabulary. Among all the extracted tokens, SecureBERT and RoBERTa [54] share 32592 mutual tokens, while SecureBERT identifies 17673 tokens specific to the cybersecurity corpus, including firewall, breach, crack, ransomware, malware, phishing, and vulnerability, among others. Each token is represented by an embedding vector with dimensions identical to those in pre-trained RoBERTa, augmented with random Gaussian noise added to the embedding factor of each token. SecureBERT emulates the architecture framework of RoBERTa [54], encompassing twelve transformer and attention layers, which are trained on the specifically collected corpus through the customized tokenizer tailored to the unique task requirements.

SecureBERT is evaluated on predicting masked security-related words within a sentence, a task known as masked language modelling. The testing dataset is generated by extracting sentences from cyber-security reports with 17341 records. The experiment shows that SecureBERT outperforms RoBERTa, a powerful model on general language, in predicting masked words within a sentence with a security context, as illustrated in the following examples [136]:

Comparisons between SecureBERT and RoBERTa in masked tasks [136]

Task 1: "Information from these scans may reveal opportunities for other forms <mask> establishing operational resources, or initial access."
SecureBERT: reconnaissance. RoBERTa: early.

Task 2: "Search order <mask> occurs when an adversary abuses the order in which Windows searches for programs that are not given a path."
SecureBERT: hijacking. RoBERTa: abuse.

Task 3: "Botnets are commonly used to conduct <mask> attacks against networks and services."
SecureBERT: DDoS. RoBERTa: automated.

The three predicted terms reconnaissance, hijacking, and DDoS are prevalent in cybersecurity corpora. SecureBERT accurately understands the security context to predict these masked words, whereas RoBERTa exhibits incorrect predictions, underscoring the advantages of SecureBERT in security-related language tasks.

2) Building security-specific LLMs from scratch: In addition to fine-tuning, another strategy is to build an LLM from scratch specifically designed for network-based attack detection. For example, Ferrag et al. [139] designed SecurityBERT for detecting the ever-evolving cyber threat landscape, which involves several steps: data preparation, data tokenization, model training, and model deployment, as shown in Fig. 8. In particular, the authors utilize a publicly available dataset EdgeIIoTset [155] related to the Internet of Things (IoT) and Industrial IoT (IIoT) connectivity protocols, categorized into five types of threats: DoS/D-DoS attacks, information gathering, MITM attacks, injection attacks, and malware attacks. Then, to leverage the power of LLMs, null features are eliminated during the feature extraction in [139], and both

numerical and categorical features are converted into textual
representations. Specifically, each feature is combined with
its column name and value and then subjected to hashing.
The hashed values from the same instance are merged into a
sequence, which generates a fixed-length textual representation
of the network traffic data while maintaining privacy. After
that, ByteLevelBPETokenizer [156] is subsequently applied to
segment the textual representations of the network traffic data.
This segmentation process breaks down the text into smaller
subwords, expected to be found in the tokenizer’s vocabulary.
After the pre-training phase, the model is fine-tuned on a
labelled dataset [155] by adding a Softmax activation function
at the output layer, which allows SecurityBERT to apply the
learned contextual representations to the specific task of attack
detection. Finally, in the deployment phase of Fig. 8, once
attacks are identified through SecurityBERT, FalconLLM is
further employed to determine the severity and negative impact
of identified attacks, leading to the formulation of potential
mitigation strategies and recovery procedures.

Fig. 9. Framework of LLM-aided 3GPP specification classification [13].
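The feature-to-text preprocessing described above can be illustrated with a small sketch. The field names, digest length, and exact formatting below are assumptions for illustration, not the precise recipe of [139]: null features are dropped, each remaining feature is combined with its column name, hashed, and the hashes are merged into one fixed-length textual sequence.

```python
import hashlib

def encode_traffic_record(record, digest_chars=8):
    """Turn one network-traffic record into a privacy-preserving
    textual sequence in the style described for SecurityBERT:
    null features are eliminated, each remaining feature is combined
    with its column name and hashed, and the hashed values are merged
    into a single string that a tokenizer can then segment.
    """
    tokens = []
    for column, value in sorted(record.items()):
        if value is None:  # eliminate null features
            continue
        pair = f"{column}={value}".encode("utf-8")
        tokens.append(hashlib.sha256(pair).hexdigest()[:digest_chars])
    return " ".join(tokens)


# Hypothetical record mixing numerical and categorical features.
record = {"proto": "tcp", "bytes": 1420, "flag": None}
text = encode_traffic_record(record)
```

The resulting string carries no raw payload or identifiers, so the same encoding can be applied before tokenization without exposing the underlying traffic data.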
SecurityBERT is employed to identify normal events and 14 distinct attacks in [139], such as DDoS UDP, DDoS ICMP, SQL Injection, Vulnerability Scanner, etc. The experiment shows that SecurityBERT achieves an average accuracy, recall, and F1-score of 0.98, 0.84, and 0.84, respectively, demonstrating the strong classification capabilities of security-specific LLMs in identifying various types of attacks. In addition, SecurityBERT significantly surpasses the performance of traditional ML and deep learning models such as decision trees, convolutional neural networks (CNN), recurrent neural networks (RNN), and long short-term memory (LSTM).

Finally, to develop security-specific LLM technologies for telecom networks, it is essential to initially create relevant training and testing datasets extracted from security-related telecom language corpora. Following model fine-tuning with security-customized tokenizers, these language models can significantly boost performance across various telecom security tasks, including cyber threat intelligence, vulnerability analysis, and threat action extraction [157], [158].

C. Text Classification

Text classification and processing is a useful technique for the telecom industry, and its applications include user enquiries and intent classification and analyses [142], automated trouble report classification [159], standard specification classification [30], and so on. In the following, we introduce two applications in telecom customer feedback analyses and standard specification classification problems.

1) LLM-aided customer feedback analyses: Customer feedback is useful for telecom operators to improve service quality such as signal coverage and strength. However, user feedback can be complicated, involving service experiences, suggestions, recommendations, and complaints. In addition, the feedback can be collected from various sources, e.g., social media, websites, phone calls, and company collection. These challenges require more advanced ML techniques to better classify and capture users' intentions. The LLM has shown superb performance in a range of text-related tasks, e.g., question answering, summarization, dialogue, and sentiment analysis, outperforming many existing techniques even in zero-shot settings. For instance, Aftan et al. applied an AraBERT model to classify telecom customer satisfaction in Saudi Arabia by using the Twitter dataset [141], and the BERT-based model obtained more accurate and stable results than conventional CNN and RNN algorithms. In addition, using LLMs to analyze customers' experience and intent has attracted considerable interest from both industry and academia, e.g., Microsoft has proposed to use LLMs to generate, validate, and apply user intent taxonomies [160]. Therefore, it shows great promise in integrating LLM technologies into the telecom industry for text-related classification tasks.

2) LLM-aided telecom standard classification: Telecom standards refer to agreed-upon specifications that ensure the interoperability, security, and reliability of telecom services. Standards play a critical role in global telecoms [126], such as 3G, 4G, and emerging 5G for mobile communications, IEEE
specification classification. 802.11 for Wi-Fi, and ITU-T recommendations. For instance,
1) Using LLMs for telecom user feedback classification 3GPP is the main organization for telecom standard develop-
and analyses: Understanding user feedback is crucial for ment, which includes three technical specification groups, and
telecom operators to improve the QoE and maintain customer each specification group consists of multiple working groups.
satisfaction and loyalty. For instance, Vieira et al. applied Given the large number of existing specifications with diverse
CNN and LSTM networks in [142] for sentiment analysis topics, Lina et al. proposed to use LLMs for specification
and topic classification, and the analysis proved that 78.3% classification, classifying the text into an existing working
of the complaints are related to weak signal coverage, and group automatically [13]. Fig.9 summarized the key processes
92% of these regions have coverage problems considering a of using LLMs to classify the 3GPP specifications. With
specific cellular operator. These analyses can be particularly proper pre-processing and fine-tuning, the experiment in [13]

24
can achieve an 80% accuracy even if only 20% of the text segments
are used. The experiment results also prove that increasing
the length of technical text segments can significantly improve
classification accuracy.
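The segment-then-classify pipeline above can be sketched with a toy majority-vote classifier. The keyword sets are purely illustrative assumptions (a real system such as [13] fine-tunes an LLM on specification text rather than matching keywords), but the sketch shows why aggregating over more and longer segments stabilizes the document-level decision.

```python
# Hedged toy sketch of classifying specification text into 3GPP
# technical specification groups (RAN, SA, CT): score each text
# segment against hypothetical keyword sets, then take a majority
# vote across segments. Not the actual method of [13].
from collections import Counter

KEYWORDS = {  # hypothetical per-group vocabularies
    "RAN": {"beamforming", "harq", "prach", "scheduling"},
    "SA":  {"slicing", "policy", "charging", "architecture"},
    "CT":  {"diameter", "signalling", "protocol", "interface"},
}

def classify_segment(segment):
    words = set(segment.lower().split())
    scores = {group: len(words & kw) for group, kw in KEYWORDS.items()}
    # ties (including all-zero scores) fall back to the first group
    return max(scores, key=scores.get)

def classify_document(segments):
    votes = Counter(classify_segment(s) for s in segments)
    return votes.most_common(1)[0][0]

doc = ["HARQ feedback and beamforming gains",
       "PRACH preamble scheduling procedure",
       "Interface protocol notes"]
print(classify_document(doc))  # majority vote over segments
```

With only the last segment available, the toy classifier would output "CT"; voting over all three segments recovers the dominant topic, mirroring the observation that longer technical text improves accuracy.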
Textual descriptions and documents are frequently involved
in the telecom industry, e.g., user comments, standard specifi-
cations, technical and troubleshooting reports, etc. Incorporat-
ing LLMs for text processing and classification will contribute
to more intelligent and reliable telecom networks.
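For text-related classification tasks, LLMs can often be used in a zero-shot manner by simply phrasing the task as a prompt. The snippet below builds such a prompt; the label set and template are illustrative assumptions, and the resulting string would be sent to any chat-completion model.

```python
# Illustrative zero-shot prompt builder for telecom text
# classification; the label names and template are assumptions,
# not a prescribed format for any particular LLM.

LABELS = ["coverage complaint", "billing enquiry", "service outage"]

def build_prompt(text, labels=LABELS):
    options = "\n".join(f"- {label}" for label in labels)
    return (
        "Classify the following telecom customer message into "
        f"exactly one category:\n{options}\n\n"
        f"Message: {text}\nCategory:"
    )

prompt = build_prompt("No 5G signal in my neighbourhood since Monday.")
print(prompt)
```

The model's completion after "Category:" is then mapped back to one of the predefined labels, with no task-specific training required.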
D. Image Classification
Computer vision is an important approach for environment
sensing, and there have been many existing studies toward
vision-aided 6G networks. For instance, vision-aided blockage
prediction and beamforming are investigated in [25] and [161].
Specifically, the authors assume that the cameras attached to
the BS can capture environment images and then use deep
learning to detect objects and 3D user locations. These studies
have shown the importance of incorporating computer vision
and image processing in telecom fields to better sense the wire-
less environment. Therefore, efficient image processing and
object classification are the prerequisites for realizing vision-
aided wireless networks. For example, Civelek et al. [162]
proposed an automated moving object classification technique
in wireless multimedia sensor networks, and such schemes can also be exploited in previous studies such as [25] and [161] for efficient object detection. In addition, Kim et al. propose an edge-network-assisted real-time object detection framework [163]. Specifically, the vehicles can compress the image based on the region of interest and transmit the compressed one to the edge cloud. Considering the limited computation resources at the BS, this can be a useful technique for BS-edge-cloud image processing and environment sensing.

The wireless environment can be very complicated with walking pedestrians, moving vehicles, building blockages, and other obstacles. Therefore, it requires dedicated model training and fine-tuning to extract useful information and identify specific objects. LLMs have been pre-trained on a huge amount of real-world datasets, and some LLMs, such as Flamingo [151] and GPT-4V [164], have proved versatile capabilities on various vision-related tasks, e.g., using text to generate images, describing given images, and object detection. For instance, Matsuura et al. investigate the zero-shot image classification capabilities of the LLaVA model [26], and they found that the performance can be significantly improved with a combination of carefully crafted prompts, hierarchical classification strategies, and adjusted model temperatures. Meanwhile, Pratt et al. [144] also demonstrate that using LLM's knowledge can immediately improve zero-shot accuracy on a variety of image classification tasks, saving considerable manual effort. In addition, LLMs can also describe and summarize the image content for further classification, documentation, and processing, and an example is given in [17] by generating the accident report of a car crash.

Finally, Fig. 10 presents an example of using LLMs for image classification and object detection in radio access networks. In particular, the cameras attached to the BS can capture environmental images, and then the image data will be sent to the LLM at the network edge by wired backhaul. The LLM can use computational resources at the edge cloud for image processing, classification, and detecting object locations such as vehicles, users, and blockage buildings. After that, the edge cloud can send back the classification and detection results to BSs, and then the BS can adjust the beamforming and hand-off decisions accordingly.

Fig. 10. Framework of LLM-aided computer vision in wireless networks.

E. Encrypted Traffic Classification

Network traffic classification is an essential technique in telecom network management, which aims at identifying the category of traffic from various applications [165]. Specifically, the widespread utilization of traffic encryption plays a significant role in protecting data and user privacy. However, it also presents challenges in capturing implicit and robust patterns within encrypted traffic, which is essential for network management. To tackle these challenges, conventional methods [147] usually extract features within encrypted traffic such as certificates to create fingerprints for classification through fingerprint matching, while these methods fall short with the advent of advanced encryption techniques. Additionally, existing ML-based studies [166] can automatically extract complex and abstract features to analyze encrypted traffic, resulting in notable performance improvement. However, these methods are heavily dependent on the amount and distribution of labelled training data, leading to limited generalization ability due to model bias.
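The conventional fingerprint-matching approach can be sketched in a few lines: hash observable handshake metadata and look the digest up in a table of known application fingerprints. All names and values below are hypothetical, and real fingerprints use richer features, but the sketch shows why any unseen combination falls through to "unknown".

```python
# Toy sketch of fingerprint-based encrypted traffic classification:
# hash certificate-like metadata plus the offered cipher list, then
# match against a table of known application fingerprints.
# All subjects, ciphers, and labels are hypothetical examples.
import hashlib

def fingerprint(cert_subject, ciphers):
    material = cert_subject + "|" + ",".join(sorted(ciphers))
    return hashlib.sha256(material.encode()).hexdigest()[:16]

KNOWN = {  # fingerprint -> application label (toy database)
    fingerprint("CN=video.example.com", ["TLS_AES_128_GCM"]): "video",
    fingerprint("CN=mail.example.com", ["TLS_AES_256_GCM"]): "email",
}

def classify(cert_subject, ciphers):
    return KNOWN.get(fingerprint(cert_subject, ciphers), "unknown")

print(classify("CN=video.example.com", ["TLS_AES_128_GCM"]))  # video
print(classify("CN=new.example.com", ["TLS_CHACHA20"]))       # unknown
```

This brittleness to unseen fingerprints is exactly what motivates learned representations such as the pre-training-based methods discussed next.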
Fig. 11. Framework of LLM-based encrypted traffic classification [146]. "BURST" refers to a set of time-adjacent network packets originating from the request or the response in a single session flow, and therefore a group of BURSTs can characterize the network flow transmission patterns.

Recently, pre-training-based methods have achieved great breakthroughs across a wide range of application fields. In particular, pre-trained models are designed to learn data representations from unlabelled data, allowing these representations to be effectively applied to downstream tasks through fine-tuning models on labelled data. In the context of encrypted traffic classification, Ma et al. [145] capture long-distance contextual relations within traffic sequences through BERT, and then integrate packet-level token semantic features at the forward and backward positions of BiLSTM, enhancing the BiLSTM attention to packet-level features. BERT-BiLSTM is evaluated to identify the types of network communication application activities using the ISCX VPN dataset [167], which includes various pcap files corresponding to different application activities. The dataset is comprised of 17 label categories, with each label representing a distinct type of application activity, including Email, Facebook, Gmail, Netflix, SCP, Skype, Youtube, and Spotify, among others. BERT-BiLSTM effectively distinguishes each application, achieving an overall accuracy of 99.70%, precision of 99.34%, recall of 99.51%, and F1-score of 99.43%, thereby surpassing the performance of traditional ML methods. The performance enhancement further indicates the advantages of BERT-BiLSTM in encrypted traffic classification: (1) BiLSTM captures the relevant features of front and rear token sequences after BERT extracts general features of encrypted traffic, learning the long-distance relations within token sequences. (2) BiLSTM captures packet-level features and contextual relations by simultaneously integrating packet-level token semantic features at both forward and backward starting positions of BiLSTM.

Moreover, Lin et al. [146] introduce Encrypted Traffic BERT (ET-BERT), as shown in Fig. 11, which aims to learn generic traffic representations from large-scale unlabelled encrypted traffic. Concretely, Datagram2Token is first utilized to convert traffic flow into word-like tokens through three steps: (1) BURST Generator extracts BURST time-adjacent network packets representing the session information. (2) BURST2Token applies a bi-gram model to convert the datagram of each BURST into token embeddings and divides a BURST into two segments for subsequent pre-training tasks. (3) Token2Embedding merges the token embeddings, position embeddings, and segmentation embeddings of each token as the input representations for pre-training. To demonstrate the effectiveness and generalization of ET-BERT, the authors conduct experiments across several encrypted traffic classification tasks, e.g., general encrypted application classification [147], encrypted malware classification, and encrypted traffic classification on VPN [148], with remarkable improvements over existing state-of-the-art methods of 5.4%, 0.2%, and 5.2%, respectively.

Although ET-BERT exhibits a strong generalization capability across various tasks, the assumption of clean pre-training data presents challenges in secure traffic classification. This vulnerability is exposed particularly when attackers craft a poisoned model with backdoors by maliciously inserting low-frequency words as toxic embeddings. Such manipulation allows attackers to deceive the normally fine-tuned model during specific classification tasks. Hence, how to construct toxic tokens within encrypted traffic can be potentially investigated as a promising direction in the field of LLM-based encrypted traffic classification.

F. Discussions and Analyses

Table VII summarizes LLM-enabled classification techniques in terms of the main features, prompt and fine-tuning requirements, advantages, and network classification application opportunities. It shows LLM's versatile capabilities on different classification tasks, ranging from textual security logs and customer comments to images and network traffic files.

In particular, Section V-B demonstrates that LLM techniques have great potential for telecom network security. Security is an important topic for telecom operations, and LLMs can contribute through their classification and detection capabilities. In particular, the LLM can handle multi-modal and heterogeneous network data, e.g., CSI, traffic load level, network device logs and network configurations, and then extract useful network security information from these correlated inputs. Additionally, some LLMs can also recommend response and recovery strategies for network incidents [139]. This indicates the potential of building an end-to-end telecom security system, from status monitoring and attack detection to incident response and recovery.

Meanwhile, LLMs can serve as zero-shot classifiers. Telecom networks involve a complicated and dynamic environment, leading to various tasks. Existing methods are usually task-specific, with dedicated designs for each incoming request. By contrast, some LLMs have shown zero-shot classification capabilities [26]. For instance, they can be directly used to classify images captured by the cameras on the BS, or analyze customer comments without prior training. Such a feature can be very useful in handling diverse tasks in complicated telecom systems such as object detection and user localization. In addition, LLMs have outstanding capabilities in processing
TABLE VII
SUMMARY OF LLM-ENABLED CLASSIFICATION FOR TELECOM.

Attack detection
- Main features: Attack detection is vital for maintaining the security and reliability of telecom networks. LLMs have showcased a strong ability to capture discriminative information within multi-modal and heterogeneous network data.
- Prompt and fine-tuning requirements: Fine-tuning LLMs on security-specific language emerges as a promising approach for attack detection. This method allows fine-tuned LLMs to maintain their proficiency in processing general English vocabulary while excelling at achieving specific security objectives.
- Advantages compared with conventional approaches: LLM techniques have the strong advantages of processing both numerical traffic loads and descriptive security-related textual contents, e.g., ransomware and keylogger, achieving better performance than existing ML and DL algorithms.
- Network classification application opportunities: LLMs can be effectively employed for detecting cyber attacks [136], [139] and contributing to the mitigation and recovery strategies against such attacks [139].

Text classification
- Main features: Text classification and processing are very useful for the telecom industry. LLMs have shown great promise in understanding and processing text and languages.
- Prompt and fine-tuning requirements: Fine-tuning LLMs on telecom language is required, e.g., network trouble report datasets and 3GPP technical specifications. There are no specific requirements for prompts.
- Advantages compared with conventional approaches: Automatic text classification and processing will greatly save human efforts on many document-related tasks, e.g., automated troubleshooting report generation and ranking.
- Network classification application opportunities: The telecom applications include user enquiries and intent classification and analyses [142], automated trouble report classification [159], and standard specification classification [30].

Image classification
- Main features: Computer vision is a very useful technique for 6G networks, enabling 3D sensing for the environment. Some LLMs have shown impressive capabilities in image and vision-related tasks.
- Prompt and fine-tuning requirements: The study in [26] shows that carefully crafted prompts are critical to improving the classification performance of LLMs, e.g., "Fill in the blank, this is a picture of {...}". However, fine-tuning LLMs on telecom-image datasets can improve classification accuracy.
- Advantages compared with conventional approaches: LLM's zero-shot learning capability can avoid the complexity of dedicated model training. For instance, [26] achieved a satisfactory performance by pure prompt engineering without any fine-tuning.
- Network classification application opportunities: LLM-aided image classification can be used for blockage prediction [161], proactive beamforming and hand-off [25], user localization [143], etc.

Traffic classification
- Main features: Network traffic classification is an essential technique in network management and security, which aims at identifying the category of traffic from various applications. LLM techniques have demonstrated remarkable performance in encrypted traffic classification.
- Prompt and fine-tuning requirements: Fine-tuning LLMs on labelled network data is crucial for ensuring their adaptability across various traffic classification scenarios, such as single-packet and single-flow classification.
- Advantages compared with conventional approaches: LLMs are capable of learning generic traffic representations from extensive amounts of unlabelled, encrypted traffic without plaintext, resulting in extracting valuable insights from encrypted traffic for downstream traffic classification.
- Network classification application opportunities: Traffic analyses and classification are very common tasks in telecom networks. LLMs can be effectively applied for encrypted traffic classification [145], [146].
text-related tasks, including both natural language, such as customer comments, and system language, such as network log files. These textual tasks are usually performed manually, but LLMs can easily handle different classification and detection tasks with much less human intervention.

Finally, LLMs can contribute to vision-aided telecom. Sensing is increasingly important for wireless networks, and computer vision is an important approach to capturing wireless environment dynamics. With pre-trained real-world knowledge, LLMs can be directly used for image and vision-related tasks, such as image description, image-text transformation, object detection, and image classification. In addition, LLM technologies also have advantages over conventional algorithms in terms of generalization capabilities. This means that LLMs can process various telecom tasks without extra training, e.g., blockage detection and prediction by using BS cameras [161], proactive beamforming and hand-off [25], and user localization [143].

VI. LLM-ENABLED OPTIMIZATION TECHNIQUES FOR TELECOM

Optimization techniques are of paramount importance to telecom network management, and this section presents LLM-enabled optimization techniques. It first analyzes the motivations and optimization capabilities of LLMs, and then introduces LLM-aided reinforcement learning, black-box optimizers, convex optimization, and heuristic algorithms, along with network optimization applications. Finally, we analyze and summarize the key findings.

A. Motivations and Optimization Capabilities of LLM

Optimization problems have been widely investigated in the communication field due to their critical importance. Existing optimization techniques can be categorized into several approaches [7]: ML-based, convex optimization, heuristic algorithms, and black-box optimization. For instance, reinforcement learning is a widely considered ML algorithm to solve optimization problems [4]. Meanwhile, fractional programming is a well-known convex optimization technique in wireless networks, e.g., decoupling signal strength from interference and noise to maximize the data rate [168]. Heuristic algorithms are particularly useful for solving problems with integer control variables [169], and black-box optimization is also a useful method to handle problems with unknown objective function structure [170].

However, applying these techniques to telecom is not straightforward. For instance, the reward function is an important part of implementing reinforcement learning, but the corresponding design can be difficult without professional knowledge of telecom. Moreover, the reward function may be
related to multiple network metrics such as delay, throughput, and packet drop rate, and incorporating these metrics into the reward function usually follows a time-consuming trial-and-error manner [171]. Similarly, although there have been many commercial convex optimization solvers, e.g., CPLEX and LINDO [172], it is worth noting that optimization problems have to be formulated in standard form, i.e., relaxing specific constraints for convexity or continuity, which is considered an obstacle to the application of convex optimization. To this end, existing studies have shown that LLMs may offer new opportunities to overcome the theory-application gap between existing optimization techniques and real-world telecom applications. There are multiple advantages to exploiting LLM-enabled optimization techniques for telecom:

Firstly, LLMs demonstrate a strong ability to follow human instructions. Specifically, the LLM agent has the potential to formulate problems, design algorithms, select models, and finally optimize the system performance based on human preferences and language instructions [173]. With LLM-enabled intelligence, operators can easily manage the network operation using simple natural language input, and then the LLM can automatically select proper ML algorithms to implement tasks with minimum human intervention.

Secondly, LLMs can lower the training and fine-tuning difficulties of ML-based network optimization. Algorithm training is considered one of the main obstacles to realizing AI-enabled wireless networks, which is usually very time-consuming. By contrast, LLMs have shown impressive few-shot or even zero-shot learning capabilities in many fields [182]. In particular, LLMs can learn in context from few or zero network management and optimization examples and then generalize to incoming new tasks. By providing a handful of examples, the LLM agent can quickly learn the hidden patterns without any extra model training and fine-tuning, saving considerable time and effort for algorithm training in network management. In addition, such fast learning capability is critical to make rapid responses to network dynamics. This means that network optimization decisions can be efficiently adjusted based on traffic patterns, user types, and operator demands.

Finally, the rich real-world knowledge of LLMs will contribute to network optimization algorithm modelling and design. The LLM is equipped with rich internalized knowledge about the world in the pre-training stage [63]. Such diverse knowledge can contribute to comprehending user preferences, task requirements, and even optimization algorithm modelling and design. For instance, LLMs can already understand the fundamental concepts of reinforcement learning and linear programming without any extra training, and both techniques are very useful in optimizing telecom networks. This real-world knowledge eliminates the gap between real-world network optimization demands and problem modelling and design. Table VIII summarizes existing studies on LLM-aided optimization techniques, including proposed methods, key findings, and telecom application opportunities. Given these motivations and the benefits of applying LLM to telecom optimization, we will introduce state-of-the-art LLM-aided optimization techniques along with telecom network optimization applications.

B. LLM-aided Reinforcement Learning for Network Optimization

Reinforcement learning is one of the most important techniques for network optimization. It explores various sequential action combinations, e.g., network resource allocation strategies and signal transmission power levels, to maximize the long-term reward, such as a higher data rate or lower transmission delay [183]. Many network optimization problems can be transformed into a unified Markov decision process (MDP), and then reinforcement learning is used to improve network metrics dynamically. For instance, resource allocation is a very common problem in many telecom scenarios, in which allocation decisions, desired network performance metrics, and network dynamics are usually defined as actions, rewards, and states, respectively [6]. However, it is worth noting that these definitions are usually intuitive and require expert knowledge of reinforcement learning techniques and telecom. Especially, most reward functions are manually designed using trial-and-error approaches, and the algorithm performance is affected by the hyperparameter selection, e.g., learning rate, batch size, and number of hidden layers. Fortunately, LLM techniques provide new opportunities to overcome these bottlenecks. This section will introduce two LLM-aided reinforcement learning techniques: automatic reward function design and verbal reinforcement learning.

1) Using LLM for reward function design: A recent survey in [27] shows that 92% of reinforcement learning researchers use manual trial-and-error reward function design, and 89% indicate that the designed rewards lead to unintended behaviour during algorithm training [184]. Such issues become more difficult in complicated telecom scenarios since various network elements are involved, e.g., users with diverse requirements, limited available resources, and dynamic network environments. To this end, LLMs show the capability of developing a universal approach for reward design, which will significantly lower the difficulty of using reinforcement learning. For instance, Song et al. [42] proposed a self-refined LLM for automated reward function design in robotics, achieving a comparable performance to manually designed functions. Kwon et al. [43] applied an LLM as a proxy reward function, where the user provides a textual prompt with a few examples or a description of the desired behaviour. In the following, we will introduce how the automatic reward function is designed.

An MDP can be defined as a tuple <S, A, R, T>, where S and A are the sets of environment states s ∈ S and actions a ∈ A, respectively. T is the transition probability with T(s, a, s') = Pr(s'|s, a), indicating the probability of taking action a under state s and reaching the next state s'. R is the reward with R = F(s, a), where F is the reward function that maps the states and action selection to an immediate reward [185]. The reward feedback R will further affect the action selection policy π with a = π(s), which means the action selection is under the current state s. However, the definition of such a reward
TABLE VIII
SUMMARY OF LLM-AIDED OPTIMIZATION TECHNIQUES STUDIES.

[42]
- Proposed LLM-aided optimization technique: An LLM framework with a self-refinement mechanism for automated reward function design, where the LLM can formulate an initial reward function based on natural language inputs.
- Key findings & conclusion: LLM-designed reward functions can rival or even surpass manually designed reward functions in 9 robot control tasks.
- Telecom application opportunities ([42]-[44]): Reinforcement learning is very useful for network optimization, and automatic reward design or a universal proxy reward function is an appealing approach to lower the difficulty of applying reinforcement learning techniques to various network management scenarios.

[43]
- Proposed LLM-aided optimization technique: It considers a universal reward design by prompting the LLM as a proxy reward function, where the user provides a textual prompt with a few examples or a description of the desired behavior.
- Key findings & conclusion: The generated rewards are well-aligned with the user's objectives and outperform supervised learning approaches.

[44]
- Proposed LLM-aided optimization technique: An LLM-aided reward design system with zero-shot generation, code-writing, and in-context improvement capabilities. It performs evolutionary optimization over reward code.
- Key findings & conclusion: It outperforms human experts on 83% of the tasks, leading to an average normalized improvement of 52%.

[174]
- Proposed LLM-aided optimization technique: It proposed a novel framework to reinforce language agents through linguistic feedback. The agent verbally reflects on task feedback signals, maintaining the reflective text in an episodic memory buffer to induce better decision-making.
- Key findings & conclusion: The proposed framework achieves a 91% accuracy on the HumanEval coding benchmark, surpassing the previous state-of-the-art GPT-4 that achieves 80%.
- Telecom application opportunities ([174], [21]): LLMs have self-improvement capability, which means they can work as an agent to receive network environment feedback and improve the policies based on textual input.

[21]
- Proposed LLM-aided optimization technique: The LLM generates new solutions from the prompt that contains previously generated solutions with their values, then the new solutions are evaluated and added to the prompt for the next optimization step.
- Key findings & conclusion: The proposed prompt-design scheme outperforms human-designed prompts by up to 8% on GSM8K [175], and by up to 50% on Big-Bench Hard tasks [176].

[177]
- Proposed LLM-aided optimization technique: Evaluating the optimization capabilities of LLMs across diverse tasks and data sizes, including gradient descent, hill-climbing, grid-search, and black-box optimization.
- Key findings & conclusion: 1) LLMs show strong optimization capabilities; 2) LLMs perform well on small-size samples; 3) they exhibit strong performance in gradient descent; 4) LLMs are black-box optimizers.
- Telecom application opportunities: A black-box optimizer is a promising approach to estimating the unknown loss function, which is especially useful since telecom networks become more complicated.

[178]
- Proposed LLM-aided optimization technique: A natural language-based system that engages in interactive conversations about infeasible optimization models. It provides natural language descriptions of the optimization model itself, identifies potential sources of infeasibility, and offers suggestions to make the model feasible.
- Key findings & conclusion: The proposed system can assist both expert and non-expert users in improving their understanding of the optimization models, enabling them to quickly identify the sources of infeasibility.
- Telecom application opportunities ([178], [179]): Convex optimization is a commonly used technique for network optimization, and integrating LLMs with convex optimization can bring promising changes to network optimization.

[179]
- Proposed LLM-aided optimization technique: An LLM-aided system that can develop mathematical optimization models, write and debug solver code, develop tests, and check the validity of generated solutions.
- Key findings & conclusion: It achieves nearly a 0.8 success rate on 41 linear programming and 11 mixed integer linear programming problems.

[180]
- Proposed LLM-aided optimization technique: Using LLMs such as GPT-4 to generate novel hybrid swarm intelligence optimization algorithms.
- Key findings & conclusion: Generated a novel meta-heuristic algorithm with pseudo-code by using 5 existing algorithms.
- Telecom application opportunities ([180], [181]): Heuristic algorithms are naturally compatible with LLM techniques, since many heuristic rules can be easily described by textual language. This offers new opportunities for selecting and designing new heuristic network optimization methods.

[181]
- Proposed LLM-aided optimization technique: A general LLM serves as a black-box search operator for decomposition-based multi-objective evolutionary optimization in a zero-shot manner.
- Key findings & conclusion: The LLM operator, having only learned from a few instances, can have robust generalization performance on unseen problems with quite different patterns and settings.

function F is not straightforward, since mapping the state s and action a to a specific value requires considerable experience and trial-and-error tests. Therefore, the objective of reward design is to use the LLM as a proxy reward function or to generate a reward function automatically [42], [43]. Given the above MDP fundamentals, extra prompt input is required as textual input for the LLM. Considering a set of prompt strings l ∈ L and a mapping function M, we need to define:
• Task description l1: The string or environment code to describe the target task [44];
• Objective description l2: The optimization objective or desired final states of the task;
• States and actions description l3: It explains the definition of states and actions in the target task;
• Examples description l4: It provides a trajectory or examples of the episode. Note that a trajectory usually serves as a demo, but it is not required in zero-shot learning.
• A mapping function M that maps the textual output of the LLM to a binary value, e.g., "good" or "bad". This binary value feedback indicates the quality criteria of the generation, and then the LLM easily understands the overall feedback.

Given these definitions, the LLM-aided MDP definition becomes <S, A, R, T, L, M>, where L is a set of prompts l1 to l4, and M is the mapping function. The reward function F in R = F(s, a) is defined by F := G(L, M), where G is the inference of LLMs. F := G(L, M) shows that the design of the reward function F depends on the prompt input L and the mapping function M. Based on the LLM-aided MDP framework, using LLM for reward function design can be summarized as the following steps:
• Step 1: Description input. Using language to describe the task, objective, states, and actions. If necessary, providing possible trajectory examples to LLMs. An alternative approach is to feed the environment code to the LLM agent and then use natural language to describe the task, which has been used in [44].
• Step 2: Initial reward function design, which will use Step 1 as input and produce initial reward function designs.
• Step 3: Reward function implementation. Using the reward function produced in Step 2 to train the reinforcement learning agent.
• Step 4: Evaluation and feedback. Evaluating the reinforcement learning training output and providing feedback to LLMs.
• Step 5: Self-improvement. Sending the feedback and evaluation results to the LLM agent, and then the LLM will produce a new reward function design. Repeating from Step 3 until the algorithm has the desired performance or reaches the maximum iteration number.

To better explain how LLM-aided reward design can be used for network optimization, Fig. 12 shows the procedure of solving a simple resource slicing problem [4]. In particular, it considers resource allocation as an example with two types of users. Group 1 indicates URLLC users that desire lower latency and higher reliability, and group 2 represents enhanced Mobile Broadband (eMBB) users with high throughput demands. As shown in Fig. 12, the resource allocation task is described by natural language as input for the LLM, including task description and user features, objectives, states and actions, and reward design rules. Note that we use "group 1" and "group 2" instead of "eMBB" and "URLLC" to lower the LLM's understanding difficulty. In addition, the features of the two groups have been clearly defined. Then, the LLM will generate an initial reward function design and send the initial design to the reinforcement learning framework for evaluation. After that, we will receive and analyze the evaluation results, e.g., convergence and system metrics. For instance, the evaluation results in Fig. 12 show that the 5% drop rate of group 1 users is much higher than the predefined threshold of 1%, and therefore the overall evaluation for this design is "bad". It is worth noting that the final evaluation of "good" or "bad" depends on the user's predefined criteria, which varies between different scenarios.

In Fig. 12, if the evaluation result is "bad", then a detailed feedback summary is provided with possible suggestions, e.g., the drop rate of group 1 users is too high. Given this feedback, the LLM agent will redesign the reward function and repeat the process from Step 3. On the other hand, if the evaluation result is "good", the system will output the final reward function design. The bottom module also shows an example in which the reward function is improved by iterations, e.g., the coefficient of Drop_rate_group1 is increased from 1 to 10, preventing dropping group 1 users. The coefficient of Throughput_avg_group2 is also improved to balance the latency and throughput metrics of the two groups.

Fig. 12. LLM for reward design in network optimization.

Reward function design is a prerequisite for applying reinforcement learning to telecom, and LLM-aided automatic reward function design significantly lowers the difficulty. However, it is worth noting that some reward functions can be very complicated in the telecom field, which may include transformation functions like arctan or sigmoid and diverse network metrics. These design problems can be more complicated if multiple network elements are simultaneously involved, such as vehicle networks and RISs [7]. The simulations in [42]–[44] have demonstrated LLM's capabilities in reward design for robotics and logic games, but the application in the telecom field is still an open issue.

2) Verbal reinforcement learning via LLM: Section VI-B1 proves that the LLM can use feedback to improve previous solutions. Given this self-improvement capability, a promising optimization technique is to consider the LLM as an agent, interacting with the environment to explore the optimal policy. Verbal reinforcement learning is proposed in [174], and it achieved satisfactory performance across diverse tasks, including sequential decision-making, coding, and language reasoning. Fig. 13 shows an example of using verbal reinforcement learning for radio access network optimization, and the agent consists of the following modules:
• Actor: The actor is built upon an LLM, which is specifically prompted to generate actions, e.g., network control and optimization strategies. Based on short-term and long-term memories, the actor can apply various methods to produce actions, such as CoT [41] and ReAct [186]. These advanced prompt techniques can improve the actor's capability of reasoning and planning, which can better adapt to the complicated decision-making of network optimization problems.
• Evaluator: The evaluator is a critical module to assess the performance of the actor. In particular, it takes the short-term trajectories as input and produces a reward score that shows the action quality. The evaluator can be defined in various approaches, e.g., a specified reward function or heuristic criteria. For instance, in resource allocation problems, the evaluator can be defined by a reward function with network metrics, or a heuristic like "all the users' requirements have been fulfilled". We still consider the radio access network as an example. The evaluator's internal feedback could be "The average latency of network edge users is too high, and 10% of edge users' communication demand is dropped. The overall performance of this trajectory is bad."
• Self-reflection: The self-reflection module is the most important part of the verbal reinforcement learning scheme, providing useful feedback instructions to the actor. Specifically, with external feedback from the environment and internal feedback from the evaluator, the self-reflection module can generate more detailed feedback to the actor, which is far more informative than a pure reward value in conventional reinforcement learning. A feedback example could be "Cell edge users should have more resources if available, and cell edge means users that are farther away from the BS than other users."
• Short-term and long-term memories: The memory mechanism consists of short-term and long-term memories. The long-term memory indicates important lessons learned from previous experience, while the short-term memory shows recent decisions and performance. This is an intuitive approach that is similar to the human brain, with fine-grained recent details and important lessons from long-term memory. With the self-reflection mechanism, the long-term memory will automatically learn important rules, e.g., "Cell edge users should have more available resources; Type 1 users are delay sensitive, so they should have higher priority."

Compared with conventional reinforcement learning, the LLM-aided verbal learning technique has multiple advantages for telecom optimization: 1) Lowering the difficulty of implementing network optimization. Verbal reinforcement learning avoids the difficulty of tuning hyperparameters like learning rate, batch size, and training frequency. This will significantly lower the difficulty of applying artificial intelligence to network optimization. 2) Allowing for language instructions to guide network optimization policies. Specifically, the LLM-aided system allows for language instructions to guide the agent exploration, which is much more efficient than existing strategies such as the ϵ-greedy policy. Experienced network operators can provide language instructions to LLMs directly, and no ML knowledge is required. 3) Reasoning and interpretable explanations for algorithm performance. One crucial advantage of LLM-aided systems is that they provide interpretable explanations of the algorithm and telecom system performance, and these experiences can further help understand network management policies.

Fig. 13. LLM-aided verbal reinforcement learning for network optimization.

Despite the advantages, LLM-aided reinforcement learning is still at a very early stage, and there are very few studies that apply this technique to the telecom field. In addition, specific telecom domain knowledge may be required to let the LLM better understand user demand. Therefore, professional wireless knowledge datasets such as TeleQnA in [125] may be required to fine-tune the LLM.

C. LLM as a Black-box Optimizer

Black-box optimization is also an appealing approach for network optimization problems. It refers to the task of optimizing an objective function f : X → R without access to any other information about f, e.g., gradients or the Hessian [187]. Telecom networks will become more and more complicated in the 6G era, and black-box optimization can avoid the complexity of building dedicated optimization models. Existing studies have shown that LLM has the black-box optimization capability to fit an unknown loss function [177]. Fig. 14 shows an example of using LLM in a black-box manner. It starts by describing the optimization task, and then the LLM will generate an initial solution. The generated solution will be evaluated by the objective function evaluator, e.g., average or sum data rate, average latency, etc. If the evaluated score is satisfactory or the maximum iteration number is reached, then the system will output the final solution. Otherwise, the current solution is sent to a solution-score pairs pool, and a new prompt will be generated accordingly for the LLM as input. Here the solution-score pair pool includes past experience and corresponding scores. By comparing the similarities of high-score solutions, the LLM can generate better solutions iteratively with few-shot learning capabilities. To better understand how LLM can be used as a black-box optimizer for network optimization, we provide an example of BS power control [188]:

• Initial task description module:

BS power control task description

"We have an interference control task related to wireless network management. We need to control the power level of two BSs to maximize the average data rate. We need you to provide the transmission power of these two BSs, and adjust the power based on provided feedback".

• Prompt inputs module for black-box optimization:

Prompt input for black-box optimization

"Below are some previous power levels and the corresponding data rate, which are arranged in ascending order.

Input: P level 1: 14 dBm, P level 2: 17 dBm;
Output: Avg rate: 1.1 Mbps;
... ... ... ... ... ...
Input: P level 1: 22 dBm, P level 2: 15 dBm;
Output: Average data rate is 1.8 Mbps;
Input: P level 1: 25 dBm, P level 2: 22 dBm;
Output: Average data rate is 2.5 Mbps.

Give me a new power level input that is different from all the traces above and has a higher average data rate than any of the above".

After the above prompts input, one can use the output to update the candidate solutions and then repeat this process as shown in Fig. 14 until obtaining a satisfactory solution. The main advantage of black-box optimization is that it avoids the complexity of defining dedicated optimization models, which have been used to automatically construct the wireless network optimization model in [189], and to optimize cellular network coverage and capacity in [190]. For the LLM-aided black-box optimizer, the existing example quality may affect the output results, and the algorithm performance cannot be guaranteed. However, telecom management usually has stringent

requirements on solution qualities to guarantee the service level, which can be an obstacle to using LLM techniques.

Fig. 14. LLM as a black-box optimizer for telecom.

D. LLM-aided Convex Optimization for Telecom

Convex optimization is a crucial technique for telecom networks, and it is commonly used in many scenarios [7]. For instance, fractional programming is especially useful for wireless network optimization due to the fractional terms in communication systems such as signal-to-interference-plus-noise ratio (SINR) and energy efficiency, and it is applied to wireless power control and beamforming [168]. Convex optimization can provide stable and efficient solutions, especially when closed-form solutions are achieved. However, deploying convex optimization techniques usually requires dedicated problem modelling, transformation, and relaxation, since the original problems may be non-convex. Therefore, the requirement for expert knowledge may prevent the application of convex optimization techniques. To improve the accessibility of convex optimization, the authors in [178] propose to use the LLM to diagnose the infeasibility of optimization problems, aiming to relax or remove some infeasible constraints, and LLM is used for convex optimization problem modelling, code generation, and solving in [179]. The experiments in [178] and [179] have demonstrated that LLM has the potential to improve convex optimization techniques.

Fig. 15 shows the key steps of using LLM to solve network convex optimization problems with the following modules:
• Problem modelling and description: Transforming the network optimization problem into a standard form is the first step of automatic problem modelling. Fig. 15 presents some key elements of defining the problem, including problem type, problem information, input and output format, objective, and solvers. Specifically, problem type specifies the type of this problem, e.g., linear programming, mixed-integer linear programming, quadratic programming, etc. Problem information includes the core description of the problem, which defines the relationship between input and output variables. Then, input and output variables show the expected input and output variables along with definitions, i.e., network decision variables and output metrics. Objectives and solvers give the optimization objective and applied solvers. Such a standard form and description will lower the difficulty of LLM understanding.
• Telecom knowledge and formulation templates: Telecom optimization requires professional network knowledge. LLM has learned fundamental knowledge in the pre-training period, such as calculating information capacity using Shannon's formula. However, using state-of-the-art telecom knowledge and formulation templates to fine-tune the LLM can better improve the modelling accuracy. For instance, a dataset named TeleQnA is defined in [125], and it includes nearly 10000 communication field questions from both standards and research articles.
• LLM and Solvers: Existing studies have shown that LLM can use the advanced features of existing solvers such as Gurobi and cvxpy to solve the problems [191]. For instance, [179] observed that LLM can use the function gurobi.abs to model an L1-norm objective instead of adding auxiliary constraints and variables. It demonstrates that LLM has the potential to take advantage of existing solvers to address complicated optimization problems. In addition, if the implementation fails, code-fix templates can also be included to address the issues automatically and rerun the test.

In summary, Fig. 15 shows an example of solving network optimization problems in an end-to-end manner. Given a proper problem description, the LLM-aided system can automatically model the problem, generate code, and call the solver to solve and debug the problem. Such a scheme has been used in [179] to solve 41 linear programming and 11 mixed-integer linear programming problems and achieved a nearly 0.8 success rate for small-scale problems using GPT-4. The study in [179] also observed that the success rate could be further improved by adding supervised tests and data augmentation.

Despite the great potential, it is worth noting that telecom networks have become more and more complicated, and there are many complicated large-scale and non-convex optimization tasks. For example, RIS-related optimization problems usually include multiple control variables, e.g., RIS phase-shift control and BS passive beamforming, which are usually optimized in an alternating approach. It still requires dedicated human effort to transform the problems into standard forms [114]. However, LLM-aided automatic convex optimization is still a promising approach that will save human time and effort on network

Fig. 15. LLM-aided Convex Optimization Problems.
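The problem modelling elements listed above can be assembled into a standard-form description before any solver is invoked. A minimal sketch, where the field names and the power-allocation task are illustrative assumptions rather than the exact format used in [179]:

```python
import json

def build_problem_description(problem_type, information, inputs, outputs, objective, solver):
    """Assemble a standard-form problem description for the first stage of
    the Fig. 15 pipeline. The five fields follow the elements discussed
    above (problem type, problem information, input/output variables,
    objective, and solver)."""
    spec = {
        "problem_type": problem_type,
        "problem_information": information,
        "input_variables": inputs,
        "output_variables": outputs,
        "objective": objective,
        "solver": solver,
    }
    return "Model the following telecom optimization task:\n" + json.dumps(spec, indent=2)

# Illustrative power-allocation task (values are hypothetical).
prompt = build_problem_description(
    problem_type="linear programming",
    information="Allocate downlink power across 3 user groups subject to a total power budget of 40 W.",
    inputs={"p_i": "power allocated to group i (W)"},
    outputs={"weighted_sum": "weighted sum of allocated powers as a rate proxy"},
    objective="maximize weighted_sum subject to p_1 + p_2 + p_3 <= 40 and p_i >= 0",
    solver="cvxpy",
)
```

The structured string is then passed to the LLM, which is expected to return solver code; keeping the description in a fixed schema is what lowers the modelling difficulty discussed above.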

optimization problem modelling and solving.

E. LLM-based Heuristic Algorithm Design

Heuristic algorithms are very useful techniques for network management and optimization. Specifically, they apply diverse heuristic rules to select near-optimal solutions with low design and computational complexity [192]. Heuristic algorithms are particularly useful for solving optimization problems with integer control variables, which are very frequently formulated in telecom. For instance, the phase-shift optimization of RISs is considered a very difficult problem with integer control variables and a large solution space, and genetic algorithms and particle swarm optimization are used in [193] and [194] to solve this problem. In addition, heuristic algorithms are intuitively compatible with LLMs, since heuristic rules can be easily described by natural language and instructions. For example, swarm-based methods are very widely used heuristic algorithms, e.g., genetic algorithm, particle swarm optimization, and grey wolf optimizer, providing near-optimal solutions by iteratively searching for better solutions. However, the number of these algorithms has grown significantly in the past decade, and selecting the proper algorithm to solve specific network optimization problems has become more difficult. Given the reasoning and understanding capabilities, LLM offers promising changes for selecting and designing novel meta-heuristic algorithms.

Fig. 16 presents an example of using LLM to design novel swarm-based meta-heuristic algorithms for network optimization, which consists of 5 tasks. Such a decomposition and CoT approach can considerably lower the prompt difficulty and improve output performance [180]. The first step is to identify the key requirements of optimization tasks. For example, RISs consist of hundreds of small units, and each requires dedicated phase-shift control, leading to a large solution space. Therefore, Prompt 1 in Fig. 16 requires the candidate algorithm to have "good exploration capabilities", and the first instruction is to "Please list 5 candidate algorithms". Then, the next task is to identify the key components of these algorithms. For instance, inertia weight and local and global best mechanisms are two key components in particle swarm optimization, and then the LLM can better understand the functionality of each unique heuristic rule. After that, Tasks 3 and 4 will generate the step-by-step design and pseudo-code of a new swarm-based meta-heuristic optimization algorithm. Most importantly, Task 5 will take full advantage of LLM's reasoning capability and explain how this novel algorithm is designed with step-by-step motivations.

In summary, Fig. 16 presents an automatic approach for novel meta-heuristic algorithm design, which can be very useful for telecom network control and optimization. For instance, many network control scenarios require rapid responses to environment dynamics such as traffic load level and user demand changes, and the LLM-aided systems in Fig. 16 have the potential to generate novel heuristic algorithms with fast convergence and low computational complexity. Additionally, such a scheme can also be used to generate new heuristic network protocols or management policies, significantly saving human efforts in terms of creation and design [30].

F. Discussions and Analyses

Subsections VI-B to VI-E have introduced various LLM-aided optimization techniques along with telecom applications. Table IX summarizes these LLM-aided optimization techniques, including main features, prompt and fine-tuning requirements, advantages compared with existing approaches, and network optimization application opportunities. In the following, we summarize our key findings and analyses.

Firstly, task description is crucial for network optimization. Task description is the first step of using LLMs, which requires accurate and standard input, e.g., input and desired output format, objective, and specific rules. In addition, these tasks are usually closely related to telecom domain knowledge, and LLM may have difficulty understanding some professional concepts. For example, in Section VI-E, the LLM may already have some general knowledge of RIS technology, but they

Fig. 16. LLM-aided meta-heuristic algorithm generation.
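The five-task chain of Fig. 16 can be sketched as a simple prompt pipeline. This is a minimal sketch: `run_llm` is a placeholder for an actual model call, and the exact wording of each template is an illustrative assumption, not the prompts of [180].

```python
def run_llm(prompt: str) -> str:
    """Hypothetical stand-in for a chat-model call; a real system would
    query GPT-4 or a similar model here."""
    return f"[model response to: {prompt[:40]}...]"

# The five tasks follow the decomposition described above.
TASKS = [
    "Task 1: The target problem needs good exploration capabilities. Please list 5 candidate swarm-based algorithms.",
    "Task 2: Identify the key components of each candidate, e.g., inertia weight, local and global best mechanisms.",
    "Task 3: Combine the most useful components into a step-by-step design of a new swarm-based meta-heuristic algorithm.",
    "Task 4: Write pseudo-code for the new algorithm.",
    "Task 5: Explain, step by step, why the new algorithm is designed this way.",
]

def design_algorithm() -> list:
    context, transcript = "", []
    for task in TASKS:
        # Each prompt carries the accumulated context so the chain stays coherent.
        reply = run_llm(context + task)
        transcript.append(reply)
        context += task + "\n" + reply + "\n"
    return transcript

outputs = design_algorithm()
```

Chaining the tasks rather than issuing one monolithic prompt is the CoT-style decomposition credited above with lowering prompt difficulty and improving output quality.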

are unable to directly understand the difficulty of RIS phase-shift control, which is very professional domain-specific knowledge. Therefore, the task description has to be carefully designed, which will directly affect the LLM output. Meanwhile, prompt design is the key to network optimization problems. Previous sections have demonstrated that the prompt is one of the most important approaches to take advantage of LLM's capabilities, and there have been various prompt engineering techniques, e.g., CoT [41], ReAct [186], zero-shot instruction [195], etc. Therefore, understanding the function of prompt engineering is crucial for applying LLM to solve optimization problems. For instance, in reward design problems, the feedback prompt is critical to improve the reward design step by step. In the heuristic algorithm design problem in Section VI-E, the output completely depends on the user prompt input to the LLM agent.

In addition, several of the above optimization approaches rely on the feedback mechanism, in which the solutions are iteratively improved based on previous answers and environment feedback. For instance, the reward function design is iteratively improved by involving the evaluation results and feedback prompts. Similarly, in verbal reinforcement learning, the LLM agent can adjust the action selections to obtain a higher reward based on environmental feedback. Therefore, the design of these prompts is crucial for improving LLM's performance, e.g., dedicated feedback and evaluator prompt designs. On the other hand, balancing exploration and exploitation is a common obstacle for many optimization problems. This problem becomes severe when the action space is larger, which is very common in telecom networks. Therefore, how to use the LLM's self-improvement capability and meanwhile balance exploration and exploitation is very important.

Finally, note that many optimization problems can be very complicated in wireless networks, involving multiple control variables, network elements, and layers. It may require step-by-step problem decomposition, formulating multiple objectives, and alternating optimization, e.g., jointly optimizing the transmission data rate and signal coverage. Handling these optimization problems needs strong planning capabilities, which is still a challenge for current LLM research fields.

VII. TIME SERIES LLM FOR PREDICTION PROBLEMS

Prediction tasks are crucial in telecom networks, involving the prediction of future trends, demands, and behaviours based on historical data, e.g., predicting network traffic, customer demand, equipment failures, and service usage. This section will introduce time series models for prediction problems in wireless networks, including pre-training foundation models, frozen pre-trained, fine-tuning, and multi-modality LLMs.

A. Motivations

Conventional prediction algorithms in the telecom domain rely on statistical and time-series analysis to estimate the output. However, telecom data is usually non-linear, non-stationary, and influenced by various external factors, leading to challenges in capturing complex patterns and relationships. While these traditional methods have been effective to some extent, they may struggle with the complexity and dynamic nature of telecom data. Recently, LLM technologies have shown promise in addressing the challenges of time-series prediction due to their ability to handle complex data structures and adapt to changing patterns.

Firstly, LLMs provide a universal and generalizable model for telecom network prediction. Given historical data, conventional prediction approaches must train a new model to adapt to incoming target tasks. These methods usually require extensive feature engineering and manual tuning, which can be time-consuming and may not generalize well across different scenarios. By contrast, the versatility of LLMs makes them suitable for processing diverse forms of time-series data, and such adaptability is crucial given the vast volumes of data generated in telecom. Moreover, LLM's capability to continuously learn and adapt to new data patterns helps mitigate the concept drift problem, ensuring that the models remain

TABLE IX
SUMMARY OF LLM-BASED OPTIMIZATION TECHNIQUES FOR TELECOM.

1) LLM-aided reward function design
• Main features: Reward function is a crucial part of reinforcement learning-enabled network optimization, and LLM provides automatic reward function design by using its self-improvement and understanding capabilities.
• Prompt/input requirements: Task/environment description; objective description; states and actions; examples or demos; a mapping function/criteria to evaluate the design as "good"/"bad".
• Advantages compared with existing approaches: Automatic reward function design can significantly save human effort in applying reinforcement learning to network optimization tasks. Automatic reward function design has produced comparable performance to human manual design.
• Potential issues: 1) Automatic reward design is still at a very early stage, and there are few applications that explore such a novel technique in the telecom field; 2) the prompt has to be carefully designed to describe the target task, which is known as prompt engineering.
• Network optimization application opportunities: Reinforcement learning is a very useful technique for telecom network management, and automatic reward design is a promising technique to enable artificial general intelligence, which is particularly useful for small-scale optimization problems to save human effort.

2) Verbal reinforcement learning
• Main features: It considers LLM as an agent, exploring the environment and accumulating experiences, and uses the self-improvement capability to improve previous solutions and obtain a higher reward.
• Prompt/input requirements: 1) The self-evaluator will provide critical feedback to the actor for improved performance; 2) short-term and long-term memories are crucial for the actor to distinguish between good and bad actions.
• Advantages compared with existing approaches: 1) Avoiding the difficulty of tuning hyperparameters like learning rate, batch size, and training frequency; 2) allowing for language instructions to guide network optimization policies; 3) providing reasoning and interpretable explanations for algorithm performance.
• Potential issues: 1) The evaluator and self-reflection modules have to be carefully designed to generate useful experience; 2) it may have exploration-exploitation difficulty, since the agent relies on previous experience to produce new solutions.
• Network optimization application opportunities: Verbal reinforcement learning can be very useful for solving problems that have been well-defined with small action spaces and immediate rewards, which is very common in telecom networks, e.g., resource allocation and association.

3) LLM as a black-box optimizer
• Main features: Black-box optimization is a useful approach for network optimization, and LLM has been demonstrated to have the black-box optimization capability to fit an unknown loss function.
• Prompt/input requirements: 1) Task description; 2) previous input and output examples, and then asking for a better solution based on previous input and output.
• Advantages compared with existing approaches: Black-box optimization avoids the complexity of building dedicated optimization models and transformations, which can be very time-consuming in complicated telecom network environments.
• Potential issues: The performance of using an LLM black-box optimizer cannot be guaranteed, since it relies on the quality of the provided input and output examples.
• Network optimization application opportunities: Black-box optimization is a promising technique for telecom networks, but it may have difficulty providing stable and reliable results. The reasoning capability of LLM may shed light on solving this problem.

4) LLM-enabled convex optimization
• Main features: LLM provides end-to-end automatic solutions for convex optimization techniques, including problem modelling, code generation, and solver implementation.
• Prompt/input requirements: 1) The problem has to be defined in standard form, so that the LLM can understand and model it; 2) telecom knowledge and formulation templates are required; 3) existing solvers have to be specified for the LLM to solve the problem.
• Advantages compared with existing approaches: 1) Automatic problem modelling is an especially promising technique, significantly saving human effort; 2) it enables automatic problem-solving in an end-to-end manner, requiring minimum human intervention.
• Potential issues: Some convex optimization problems in the telecom field are extremely complicated, with coupled control variables and highly non-convex objectives and constraints. These problems can be very difficult to solve automatically.
• Network optimization application opportunities: Many network control problems can be formulated as convex optimization problems, and LLM-aided convex optimization has great potential to solve these problems efficiently with much less human effort.

5) LLM for heuristic algorithms
• Main features: Heuristic algorithms are inherently compatible with LLM, since many heuristic rules can be easily described by natural language. LLM offers opportunities for heuristic algorithm selection and design for specific network optimization tasks.
• Prompt/input requirements: It may require a series of prompts in a CoT manner, including candidate algorithm selection, analyses, new algorithm and pseudo-code generation, and reasoning.
• Advantages compared with existing approaches: Automatic heuristic algorithm selection and design will considerably save human time on algorithm analyses and design. It can also provide reasoning and analyses of the generated results.
• Potential issues: The generated heuristic algorithms still need to be tested and verified. Such an automatic design cannot guarantee the performance of the produced algorithm.
• Network optimization application opportunities: Heuristic algorithms are widely used for telecom network optimization and management, and LLM has promising potential for heuristic algorithm selection and design, producing novel algorithms that can better serve telecom networks.
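The solution-score prompting loop of the black-box optimizer entry above (Section VI-C, Fig. 14) reduces to maintaining a pool of (solution, score) pairs and rebuilding the prompt each round. A minimal sketch, reusing the illustrative power/rate traces from the earlier BS power control example:

```python
def build_opro_prompt(pairs: list) -> str:
    """Build the solution-score prompt for the LLM black-box optimizer.
    `pairs` holds ((P_level_1, P_level_2), average_rate) traces; the
    concrete values used below are illustrative only."""
    pairs = sorted(pairs, key=lambda p: p[1])  # worst to best, as in the Fig. 14 example
    lines = ["Below are some previous power levels and the corresponding data rates,",
             "arranged in ascending order."]
    for (p1, p2), rate in pairs:
        lines.append(f"Input: P_level_1: {p1} dBm, P_level_2: {p2} dBm; "
                     f"Output: average data rate is {rate} Mbps;")
    lines.append("Give me a new power level input that is different from all the "
                 "traces above and has a higher average data rate than any of the above.")
    return "\n".join(lines)

prompt = build_opro_prompt([((22, 15), 1.8), ((14, 17), 1.1), ((25, 22), 2.5)])
```

Each round, the LLM's proposal is evaluated by the objective function evaluator, appended to the pool, and the prompt is rebuilt, so the model conditions on an ever-better set of exemplars.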

relevant and effective over time. As a result, the integration of LLM techniques in time-series prediction offers a promising avenue for developing more robust and generalizable models that can better handle the complexities of data in telecom.

Meanwhile, LLMs have excellent ICL capabilities, which means that they can perform new tasks by leveraging contextual information in demonstrations. In particular, it means that the LLM can directly learn from the provided examples, and map the input-output relationships without extra model training. Such a prediction method is much more efficient than conventional prediction methods. Meanwhile, it is also more accessible since no professional knowledge of model training/fine-tuning is required. In addition, multi-modal LLM-enabled prediction can also be combined with sensing in telecom networks. In particular, multi-modal LLMs can process and integrate information from various data types, such as text,

images, audio, and time-series data. In addition, sensing is an important part of envisioned 6G networks, aiming to integrate environmental information into communication networks, e.g., the image captured by street cameras or satellites, 3D LiDAR maps and WiFi sensing. In the context of telecom prediction, multi-modal LLMs can combine sensing data with numerical time-series data to generate more accurate context-aware prediction, which can be particularly useful in 6G.

Given the great potential, it is important to investigate time series LLM techniques and applications in telecom networks. In the following, we will introduce various LLM-based prediction methods and applications to telecom networks.

B. Pre-training Foundation Models for Zero-shot Prediction

The pursuit of training a general-purpose foundation model for time-series data is driven by the desire to address the inherent challenges associated with diverse and dynamic data. Traditional time-series methods may struggle to adapt to the non-stationary properties of real-world data, where the statistical characteristics of the series can change over time due to evolving patterns and trends. For instance, the network traffic load level can be affected by many factors, including time, area, environment buildings, service types, etc. It usually requires dedicated model design and training for each target task, and then extracts the underlying patterns from history datasets [197]. By contrast, a general-purpose foundation model aims to overcome these challenges by leveraging the advancements in LLM technologies. The following will first formulate the problem of training a foundational LLM for time-series prediction, and then discuss different tokenization mechanisms and model architectures.

1) Problem formulation: The primary goal of a foundation model is to design a zero-shot forecasting scheme that utilizes the past t time points of a time series as input to predict the future h time points. Let the input context be y_{1:L} := {y_1, . . . , y_L} and the prediction horizon be y_{L+1:L+H}. The model, denoted as f_θ (parameterized by θ), aims to map the context to the horizon, i.e., f_θ : (y_{1:L}) → ŷ_{L+1:L+H}. In this setting, the prediction model f_θ maps the feature space X to the dependent variable space Y. The spaces are defined as X = {y_{[0:t]}, x_{[0:t+h]}} and Y = {y_{[t+1:t+h]}}, where h is the prediction horizon, y is the target time series, and x are exogenous covariates. The prediction task is to estimate the conditional distribution:

P(y_{[t+1:t+h]} | y_{[0:t]}, x_{[0:t+h]}) = f_θ(y_{[0:t]}, x_{[0:t+h]})    (1)

2) Tokenization mechanisms: Motivated by ViT [198], many existing works use patching to convert the raw input sequences to tokens. In particular, each time series x_{[0:t]} is segmented into a series of patches, which may overlap or be distinctly separate. The patch length is denoted as P, and the stride, representing the non-overlapping interval between consecutive patches, is denoted as S. Consequently, this patching technique produces a sequence of patches x_p ∈ R^{P×N}, where N denotes the number of patches, calculated by N = ⌊(L−P)/S⌋ + 2. Before patching, S repetitions of the final value x_t are appended to the sequence's end. This tokenization mechanism effectively reduces the number of input tokens from L to roughly L/S, which significantly diminishes the memory space consumption and computational intensity.

3) Model architecture: Most existing works employ either encoder-decoder or decoder-only architecture as the backbone model to train a time-series foundation model.

Encoder-decoder: The encoder-decoder transformer architecture stands out for its remarkable efficiency and efficacy, primarily attributed to its self-attention mechanism [50]. Fig. 17 shows an example named TimeGPT that exemplifies the application of the encoder-decoder transformer for prediction problems [196]. In particular, TimeGPT inputs a historical sequence of data points to predict future values. The inputs are added relative positional embedding, which demonstrates higher capability to handle long sequences than the original absolute positional embedding of the transformer [50]. Its encoder captures temporal dependencies within the historical context, encoding it into a latent space, while the decoder utilizes this encoded information to predict future values. As shown in Fig. 17, with its specialized architecture, TimeGPT can address the intricacies of time-series data, such as trends and seasonality, which makes it an ideal model for telecom time-series prediction tasks such as network traffic load, channel state, user mobility, etc. Once pre-trained, such a universal model can be used for various prediction tasks without extra training. By contrast, conventional methods such as RNN and DNN are usually task-specific, and training a new model from scratch for each incoming new task is time-consuming.

Decoder-only: Even though encoder-decoder models exhibit impressive effectiveness for handling sequences, decoder-only models have become more popular in recent years. Like the encoder-decoder architecture, the decoder-only model must first tokenize the raw inputs and then incorporate positional embeddings. The essential difference between encoder-decoder and decoder-only models is that bidirectional attention is used by the encoder [50], which means each token is attending to all other tokens. In contrast, the decoder-only model employs causal attention, where each token cannot attend to tokens after it, but can only look at tokens before it. The causal attention mechanism enhances prediction models because it is well-suited for time-series forecasting tasks. In time-series forecasting, we typically predict future values based on historical data. Causal attention allows each token to consider all preceding tokens, meaning it attends to all events that occurred before the current time frame. Moreover, with these small modifications, the attention score matrices in decoder-only models are triangular matrices, which always have full-column rank, resulting in better expressibility [199]. As shown by Fig. 18, TimesFM [200] employs decoder-only architecture to train a time-series prediction model. Unlike traditional LLM techniques, which predict one element at a time, TimesFM is designed to predict extended future sequences in a single step, enhancing accuracy for long-term predictions. This flexibility also extends to inference; given a series, the model can predict its immediate future in fewer steps than

Fig. 17. Encoder-decoder-based TimeGPT for prediction problems in telecom networks [196].
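The bidirectional versus causal attention distinction drawn between the Fig. 17 and Fig. 18 architectures can be made concrete with 0/1 attention masks. A minimal sketch in plain Python (rows are query tokens, columns are the tokens they may attend to; the convention is illustrative):

```python
def bidirectional_mask(n):
    # Encoder-style attention: every token may attend to all n tokens.
    return [[1] * n for _ in range(n)]

def causal_mask(n):
    # Decoder-only attention: token i may attend only to tokens j <= i,
    # giving the lower-triangular score matrix discussed in [199].
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

enc = bidirectional_mask(4)
dec = causal_mask(4)
# dec[0] == [1, 0, 0, 0]: the first token sees only itself;
# dec[3] == [1, 1, 1, 1]: the last token sees the full history.
```

The lower-triangular shape of the causal mask is what makes decoder-only models a natural fit for forecasting: each position conditions only on the past.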

a model with equal-length input and output segments would require. Such fast inference could be an appealing feature for telecom applications, because many prediction tasks require rapid response to network dynamics, such as channel state, short-term traffic changes, and indoor user locations. Conventional prediction methods usually take a long training time to adapt to such network environment changes. By contrast, TimesFM has the potential to capture short-term patterns instantly, which aligns with the fast-response requirements of telecom networks.

In summary, the key differences between encoder-decoder and decoder-only architecture can be found by comparing Fig. 17 and Fig. 18. In particular, the encoder-decoder design in Fig. 17 includes an encoder to encode the raw features into latent representations by using bidirectional attention. In contrast, the decoder-only scheme in Fig. 18 illustrates causal attention, e.g., the first token is attended by all other tokens, and the second token is attended by all except the first token.

With a backbone model and an effective tokenization mechanism, one can train a time-series prediction foundation model for telecom with a mixture of different datasets. Meanwhile, understanding the tokenization and model architecture differences is crucial for designing and pre-training a time series LLM for telecom applications. For instance, the tokenization mechanism introduced in the previous Section VII-B2 is very useful for telecom applications, since telecom networks are associated with a large number of network devices and end users, generating a huge number of datasets, such as historical CSI, traffic load level [201], and network performance metrics [202]. Therefore, reducing the number of input tokens can lower the pre-training difficulty of LLMs, especially considering that network edge devices usually have limited computational resources.

Fig. 18. A decoder-only model named TimesFM for time-series prediction proposed in [200].

C. Frozen Pre-trained LLM for Prediction

Rather than developing a specific LLM for prediction, frozen pre-trained LLM refers to approaches that directly adapt a general-domain LLM to prediction tasks. This section delves into using a pre-trained LLM for prediction tasks without the necessity for further fine-tuning. There are two primary approaches: prompting-based and preprocessing-based methods. Specifically, the prompting-based methods include hard and soft prompts. Hard prompts employ rigid and predefined textual structures to present time-series information in a format that is intuitive for the language model. Conversely, soft prompts adopt a more nuanced strategy by integrating trainable embeddings within the input that subtly guide the language model's predictions. Meanwhile, preprocessing-based methods aim to reformat the time series numerical values into

a representation that aligns more seamlessly with LLM's tokenization process, rather than introducing extra template tokens or trainable embeddings.

1) Prompting-based methods: In leveraging prompt engineering, two predominant prompting strategies are utilized: hard prompts [203] and soft prompts [204].

Hard prompts (P_hard) involve pre-pending a fixed textual instruction and query to the input data sequence, or fitting the raw input data into a carefully designed template. In this way, we can transform numerical values into textual contexts that can be processed by pre-trained large language models. By leveraging the impressive generalizability of pre-trained LLMs, hard prompting techniques can yield high prediction accuracy in zero-shot settings. Specifically, the model input for a time-series x_{[1:T]} with a hard prompt is thus formalized as the concatenation [P_hard; x_{[1:T]}], which directs the model to generate a prediction in response to the prompt. When designing hard prompts for time-series prediction with language models, the general guideline is to transform numerical data into a format that mimics natural language constructs [203]. This involves two main components: input prompts and output prompts. Input prompts provide historical context and highlight the target time step for prediction, while output prompts focus on the desired prediction value, serving as the ground truth label for training or evaluation. Table X presents several telecom examples of designing hard prompts for specific tasks, such as network traffic load prediction, network user number prediction, and customer service prediction. The process mirrors the source/target structure common in machine translation tasks or can be likened to a question-answering setting, with the context as background information and the question seeking future insights. Then the output prompt becomes the answer to this query, such as "the number of active users, traffic load level at specific times, and predicted customer calls next week". According to PromptCast [203], this simple approach achieves comparable or superior prediction accuracy across various datasets, demonstrating its effectiveness in bridging the gap between raw numerical sequences and language-based data representations, further facilitating the use of language models for prediction tasks traditionally handled by numerical methods.

Conversely, soft prompts (P_soft) introduce trainable embeddings that are optimized during training to influence the model's prediction subtly [47]. The input for a soft prompt is represented as [P_soft; x_{[1:T]}], where P_soft constitutes a series of parameters that are fine-tuned to enhance the predictive capability of the model. This adjustable approach allows the model to internalize and apply nuanced guiding signals without the rigidity of fixed textual cues. Fig. 19 shows an example of using soft prompts in [47]. TIME-LLM utilizes two types of inputs: a textual description of domain knowledge with some in-context examples, and a time-series input. The textual description is tokenized and processed through the embedding layers of the pre-trained LLM to generate latent representations, termed as prompt embeddings. When a time-series input is received, it is tokenized and embedded through a method called patching, which includes a specialized embedding layer (patch reprogram), resulting in patch embeddings. The pre-trained LLM then takes the concatenated prompt and patch embeddings as inputs and produces outputs via an output projection layer. Throughout this process, all parameters of the pre-trained LLM remain frozen, requiring training only for the custom embedding layer to connect the time series and textual data. When designing soft prompts for time-series prediction using a pre-trained LLM, there are a couple of guiding principles and design choices. Unlike hard prompts, soft prompts require no explicit textual additions to the input data. The general approach involves encoding time series data into a format that the LLM can process, harnessing its underlying capabilities to discern patterns and generate predictions. To utilize soft-prompt prediction for telecom efficiently, one might consider the specific characteristics of the telecom time-series data, such as traffic patterns or usage trends, to design the transformation and reprogramming steps that align the time-series data with the model's language understanding capabilities.

Fig. 19. The model framework of TIME-LLM with soft prompt [47].

2) Preprocessing-based methods: The preprocessing-based method leans on the LLM's inherent ability to detect and follow patterns within generalized sequences, devoid of reliance on any specific language structure. In particular, when numerical values are adeptly transformed into textual strings, prediction with the model adheres to standard language model sampling methods. Therefore, tokenization plays a pivotal role because it shapes the model's perception of numerical patterns. LLMTIME [45] proposes two ways to preprocess the raw data:

• Introducing extra space: For example, GPT-3's tokenizer might dissect the number 42235630 into
TABLE X
THREE HARD PROMPT EXAMPLES FOR PREDICTION TASKS IN TELECOM.

Network traffic load prediction:
• Input prompt (source): From {t1} to {tobs}, network {Um} experienced {xt1:tobs} GB of traffic each hour.
• Question: What will the data traffic be on {tobs+1}?
• Output (target): The network will experience {xtobs+1} GB of traffic.

Network users prediction:
• Input prompt (source): From {t1} to {tobs}, the BS had {xt1:tobs} active connections each day.
• Question: What will the BS utilization be on {tobs+1}?
• Output (target): The BS will have {xtobs+1} active connections.

Customer service prediction:
• Input prompt (source): From {t1} to {tobs}, customer service received {xt1:tobs} calls each week.
• Question: How many service calls will be received in the week of {tobs+1}?
• Output (target): There will be {xtobs+1} service calls received.
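Hard prompts such as the traffic-load template in Table X can be generated with simple string formatting. A minimal sketch (the network name, timestamps, and load values are illustrative, not from a real dataset):

```python
def traffic_load_prompt(network, loads, t1, t_obs, t_next):
    """Fill the Table X traffic-load template: source context plus question."""
    history = ", ".join(str(x) for x in loads)
    source = (f"From {t1} to {t_obs}, network {network} experienced "
              f"{history} GB of traffic each hour.")
    question = f"What will the data traffic be on {t_next}?"
    return f"{source} {question}"

prompt = traffic_load_prompt("A", [12.4, 13.1, 15.8], "09:00", "11:00", "12:00")
# -> "From 09:00 to 11:00, network A experienced 12.4, 13.1, 15.8 GB of
#     traffic each hour. What will the data traffic be on 12:00?"
```

The ground-truth answer for training or evaluation then follows the output template, e.g., "The network will experience 16.2 GB of traffic."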

[422, 35, 630], which complicates arithmetic operations. To address this, a preprocessing step is introduced where digits are separated by spaces, and time steps by commas, ensuring uniform tokenization of each digit: "4 2 2 3 5 6 3 0". With this small change, the tokenizations are completely different. Each digit now is processed by the model individually.

• Eliminating decimal points and rescaling: Given a fixed precision, the decimal points are redundant and unnecessary. Decimal points are excluded under fixed precision to optimize context length, transforming a series "0.123, 1.23, 12.3, 123.0" into "12, 123, 1230, 12300". It provides a straightforward approach to processing the inputs.

In terms of telecom application potentials, these two preprocessing techniques provide a simple but efficient approach to using LLM techniques for prediction. They eliminate the need for careful designs of prompts, which can better adapt to various prediction tasks in telecom. Preprocessing-based methods have the potential to generate prediction results instantly based on given raw network data input.

D. Fine-tuned LLM Prediction

Fine-tuning pre-trained LLMs presents a significant advancement for time-series prediction, offering a powerful alternative to traditional prediction approaches [205], [206]. General-domain LLMs, initially pre-trained on extensive linguistic data, can be fine-tuned to capture the unique temporal patterns inherent in time-series data. This process equips LLMs with the ability to predict effectively in domains where data scarcity or specificity presents challenges to conventional deep learning models. In the pursuit of efficiency and practicality, most recent works have shifted towards parameter-efficient fine-tuning methods like Low-Rank Adaptation (LoRA) [207] and Layer Normalization Tuning (LNT) [208]. In particular, LoRA adapts pre-trained models to new tasks by modifying the weight matrices of the model's layers. Given a weight matrix W ∈ R^{d×m} in a pre-trained model, LoRA fine-tuning introduces two low-rank matrices A ∈ R^{d×r} and B ∈ R^{r×m}, where r is the rank and r ≪ min(d, m). The weight matrix W is updated as:

W′ = W + ∆W, where ∆W = AB.    (2)

The matrices A and B are the parameters learned during fine-tuning while the original weights W are kept frozen. This results in a model that is fine-tuned for the task at hand with only a small increase in the number of parameters. Many works of fine-tuning LLMs proposed applying the technique to the query (Q) and value (V) matrices in attention layers, showing notable results without extending it to all parameters within the attention or feed-forward layers. In the context of time-series prediction, however, this selective fine-tuning may require adjustment. As shown in Fig. 20, LLM4TS [205] applies LoRA fine-tuning to the query (Q) and key (K), achieving state-of-the-art performance. It augments the pre-trained model with additional trainable components and thus incorporates changes to the model's weights, unlike the soft prompting to modify the inputs. Using LoRA allows for retaining the general capabilities of the LLM while imbuing it with domain-specific knowledge, ensuring that the time-series prediction model is both specialized and robust.

On the other hand, LNT offers a focused approach to adapt pre-existing parameters in transformer blocks to specific tasks. LNT specifically targets the affine transformation parameters within the layer normalization components of a transformer model. These parameters, such as scale and shift, originally set to ensure standardized input distribution across network layers, become trainable to allow the model to retain its learned representations while fine-tuning the time-series prediction. As shown in Fig. 20, LLM4TS [205] employs both LNT and LoRA fine-tuning for the query and key. A similar strategy can be found in [206], which freezes all attention and feed-forward layers, and only fully fine-tunes the embedding layers and applies the LNT. Incorporating LNT in the fine-tuning process, in the context of adapting pre-trained LLMs for time-series prediction, provides a mechanism for the model to adjust its internal normalization to better fit the dynamics and scale of the time-series data.

These parameter-efficient fine-tuning methods, such as LoRA and LNT, are crucial for the practical deployment of LLMs in telecom. Lin et al. claim that applying LoRA to GPT-3 can reduce the number of trainable parameters from 175.2 billion to 37.7 million [16], and combining LoRA with federated split learning can significantly reduce computing and communication latency at the mobile edge.
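Equation (2) can be made concrete with a small numerical sketch (pure Python, with illustrative layer sizes): W stays frozen, only A and B are trainable, and the trainable parameter count drops from d·m to d·r + r·m.

```python
d, m, r = 8, 8, 2  # weight shape and LoRA rank, with r << min(d, m)

# Frozen pre-trained weight W and trainable low-rank factors A (d x r), B (r x m).
W = [[float(i + j) for j in range(m)] for i in range(d)]
A = [[0.1 * (i + k) for k in range(r)] for i in range(d)]
B = [[0.0] * m for _ in range(r)]  # standard LoRA init: B = 0, so W' = W at the start

# Effective weight W' = W + Delta W with Delta W = A @ B, as in Eq. (2).
delta = [[sum(A[i][k] * B[k][j] for k in range(r)) for j in range(m)] for i in range(d)]
W_prime = [[W[i][j] + delta[i][j] for j in range(m)] for i in range(d)]

trainable = d * r + r * m  # 32 parameters updated during fine-tuning
full = d * m               # 64 parameters in a full fine-tune of this layer
```

At GPT-3 scale the same ratio is what [16] reports: the trainable set shrinks from 175.2 billion to roughly 37.7 million parameters.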

Fig. 20. The model framework of LLM4TS framework [205]. Q, K, V are the query, key, value vectors respectively. Wq, Wk, Wv are the matrices used for generating the query, key and value vectors.

E. Multi-modal LLM for Telecom Prediction

Multi-modal learning is a promising feature of LLM techniques, aiming to process related information from multiple modalities, such as text, audio, image, video, 3D maps, graphs, etc [209]. A multi-modal LLM can use diverse encoders to extract features from different modalities into desired outputs, indicating a more comprehensive and flexible approach to process information. Such multi-modal capabilities can be particularly useful for integrating sensing and communication, which is a crucial technique in 6G networks. In particular, as shown in Fig. 21, LLMs can include multiple inputs with diverse modalities, e.g., the image captured by satellite or street cameras, 3D LiDAR maps and videos collected by vehicles. Sensing has become a critical part of envisioned 6G networks, and multi-modal LLMs are capable of making the most of the collected sensing information. On the other hand, LLMs can also include conventional tabular-based numerical input, and further consider textual input and prompt instructions. With multi-modal inputs, LLM agents can better understand the surrounding environment and then make more accurate predictions for network dynamics.

1) Channel state information (CSI) prediction: CSI plays an increasingly vital role in wireless networks, enabling the transmitter to adjust the transmission parameters based on current channel conditions and, therefore, achieve better performance. Prediction-based methods are appealing approaches to obtain instantaneous CSI. For instance, Jiang et al. [210] applied deep learning for CSI prediction using generated or historical data. Most existing studies consider single modality input, which usually consists of tabular-based numerical data such as historical CSI. However, the real-world environment can be more complicated, and CSI may be affected by many other factors such as weather conditions and dense buildings [5], [211]. These multi-modal inputs, such as weather maps and building distributions, can provide a more comprehensive understanding of the signal transmission environment, but jointly processing these inputs is beyond the capabilities of existing techniques. Multi-modal LLMs offer promising solutions by jointly considering diverse modalities and data sources, producing more accurate CSI prediction results. In addition, users can provide textual prompt instructions, which are easily accessible and user-friendly for non-researcher users.

2) Prediction-based mmWave/THz beamforming: The increasing traffic demand and limited bandwidth resources make mmWave and THz communications promising techniques. However, these high-frequency transmissions are highly directional and vulnerable to signal blockages. Consequently, efficient beamforming and alignment are required to achieve reliable mmWave and THz networks. For instance, Ke et al. [212] applied a Gaussian process-based ML scheme for UAV position prediction and UAV-mmWave beam-tracking, and Shah et al. [213] deployed LSTM networks to predict multiple mmWave beams from multiple cells. These predictions usually consider numerical input, especially historical data [212]. Charan et al. [25] introduced computer vision-aided techniques for signal blockage prediction, using cameras on BSs to capture possible blockages and then initiate user hand-off beforehand. However, it is still limited to a single image modality with limited environmental information. By contrast, multi-modal LLMs can take holographic input from the environment, and jointly consider historical tracks, instant images, and text instructions, etc. These comprehensive inputs can produce more accurate and reliable prediction results, contributing to efficient mmWave and THz beamforming.

3) Traffic load prediction: Accurate traffic load prediction is the prerequisite of efficient network management, which is related to user numbers, service types, time periods, and so on. Similar to CSI prediction, most existing studies take numerical datasets as single modal input [214], [215]. For instance, Alekseeva et al. [214] compared the performance of seven ML algorithms (including Bagging, Random Forest, Gradient Boosting, Linear Regression, Bayesian Regression, Huber Regression, and SVM Regression) on the task of traffic load prediction. Their findings indicate that Boosting-based methods demonstrate superior performance when handling large volumes of load data, yet incurring significant training costs. Hu et al. [215] integrated a sequence of AutoEncoders to extract multiple sets of latent temporal features from historical load data for load prediction, which ensures that extracted feature sets are representative of the entire load data. Most existing studies mainly consider two factors: the spatial correlation between nearby BSs and the temporal dynamics captured in historical data. However, Abdullah et al. [216] suggested that various meteorological factors, such as rain,

Fig. 21. Multi-modality LLM for prediction problems in wireless networks.

wind, and temperature, can also significantly influence the volume of traffic loads. Consequently, multi-modal LLMs may be used to harness diverse information streams, including spatial BS correlations, temporal historical traffic loads, and environmental factors, facilitating accurate load prediction and providing effective network management and service delivery.

4) Quality of Experience (QoE) prediction: QoE is a measure of the customer's experiences of specific services, which is a useful metric in diverse mobile scenarios, such as mobile edge computing [150], edge caching [217], and resource allocation [218]. QoE is closely related to the user's natural language comments. An example is given by [150]: "I am having the same issues as everyone else...Phone shows 5 bars on 4G. Makes calls and texts just fine but no imessage or internet (safari as well as any other apps that require connectivity). Right now the two things I have noticed are that I'm more likely to have it work late at night (11pm-2am) and more likely to have it work when I'm outdoors..." Most existing studies predict QoE by extracting the key attributes of users, devices, applications, and networks for modeling and measuring. However, this comment indicates a specific network issue "no imessage or internet at midnight". Extracting such an informative and specific user experience to several attributes could lead to considerable information loss, and therefore the service provider cannot fully understand the user's demand. With multi-modal LLMs, the user's textual comments and network numerical metrics can be jointly evaluated, providing a comprehensive evaluation of the network performance and user experiences. In addition, multi-modal LLMs can also be used to generate and predict user experience using LLM's comprehension and reasoning capabilities.

F. Discussions and Analyses

Table XI summarizes the LLM-enabled prediction techniques in terms of main features, input and fine-tuning requirements, advantages, and telecom prediction application opportunities. We summarize the key findings as follows.

Firstly, large-scale time-series datasets are important for building Time-LLM for telecom. Previous sections have demonstrated LLM's potential for solving time-series prediction problems. However, it is worth noting that time-series datasets are prerequisites of pre-training Time-LLM, and then the LLM can understand and capture the hidden patterns of the input data. Despite the importance, collecting such datasets can be difficult in telecom due to various data formats and sources, different network operators, customer privacy, etc.

Secondly, prompting and preprocessing-based methods are the most efficient approaches to using LLM for prediction tasks. Compared with pre-training and fine-tuning, the discussions in Section VII-C demonstrate that prompting is one of the most straightforward methods of using an LLM for prediction tasks. Such an advantage can still be explained by LLM's impressive zero-shot learning capabilities. In addition, preprocessing input data is another simple method. Transforming numerical values into textual strings can make the most of LLM's capabilities in processing standard language tasks. These two methods are particularly useful for short-term prediction problems in telecom with instant responses.

In addition, previous sections also show that parameter-efficient fine-tuning methods are critical for LLM deployment in telecom. Section VII-D introduced two parameter-efficient methods, LoRA and LNT, to fine-tune LLM for prediction tasks. Efficient fine-tuning methods can improve overall computing efficiency, lowering the demand for computational resources. These features are very useful for processing various tasks ranging from generation and classification to prediction problems in the telecom field such as in Section VII-D, especially considering limited computational and storage resources at the network edge.

Finally, multi-modal LLM has great potential for telecom applications. Incorporating multi-modality LLM into telecom has been discussed in multiple existing studies [30], and Section VII-E investigates the potential for telecom prediction problems such as CSI prediction, prediction-based beamforming, and QoE prediction. The key motivation is that multi-modal information from multiple sources can contribute to prediction accuracy, and such enhancement can further improve the network operator's decision-making.
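The digit-level preprocessing summarized above (Section VII-C2) is easy to sketch. The helper below follows the two LLMTIME steps, rescaling to drop decimal points under a fixed precision and then space-separating digits and comma-separating time steps (the precision value is illustrative):

```python
def preprocess_series(values, precision=2):
    """Rescale by 10**precision to drop decimal points, then space-separate
    digits and comma-separate time steps for uniform tokenization."""
    steps = []
    for v in values:
        digits = str(round(v * 10 ** precision))  # e.g. 12.3 -> "1230"
        steps.append(" ".join(digits))
    return " , ".join(steps)

encoded = preprocess_series([0.123, 1.23, 12.3, 123.0])
# reproduces the "12, 123, 1230, 12300" example in digit-separated form:
# "1 2 , 1 2 3 , 1 2 3 0 , 1 2 3 0 0"
```

Because no prompt template or fine-tuning is involved, this kind of transformation can be applied on the fly to raw network measurements before querying a frozen LLM.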

TABLE XI
S UMMARY OF LLM- BASED P REDICTION FOR T ELECOM .

LLM-based
Summary of LLM-based prediction techniques:

Pre-training foundation models
- Main features: Leverages pre-trained models on diverse time series datasets to capture general temporal patterns. It means training LLMs from scratch specifically for prediction purposes.
- Input and fine-tuning requirements: Requires a large corpus of time-series data for initial pre-training, but collecting these datasets may be difficult in telecom; fine-tuning may be needed for specific telecom tasks.
- Advantages compared with conventional approaches: The model has zero-shot prediction capabilities and quickly adapts to new tasks with minimal fine-tuning; captures a wide range of temporal dynamics.
- Telecom prediction application opportunities: A prediction foundation model for telecom can handle various short-term or long-term prediction tasks, such as traffic load prediction, CSI prediction, user number estimation, and so on.

Frozen pre-trained LLM for prediction
- Main features: Using pre-trained LLMs directly without fine-tuning their parameters, including prompting-based and preprocessing-based methods.
- Input and fine-tuning requirements: Tokenization and embedding of time series data; for prompt-based methods, the prompt format must be carefully designed; no fine-tuning of LLMs is required.
- Advantages compared with conventional approaches: Low computational cost and design complexity; leverages the generalization capabilities of a pre-trained LLM directly.
- Telecom prediction application opportunities: This technique is particularly useful for short-term prediction, such as short-term traffic load and network performance prediction. The low computational cost can also adapt to the network edge, and even mobile applications.

Fine-tuned LLM
- Main features: Adapts a pre-trained LLM to telecom-specific prediction tasks through fine-tuning techniques such as LoRA and LNT.
- Input and fine-tuning requirements: Requires time series data for fine-tuning; may require parameter-efficient fine-tuning like LoRA and LNT to improve the efficiency.
- Advantages compared with conventional approaches: Fine-tuning can incorporate telecom domain knowledge into LLMs, improving the prediction accuracy and specificity of telecom tasks.
- Telecom prediction application opportunities: Fine-tuned LLMs can better adapt to specific tasks in telecom, e.g., collecting specific datasets to fine-tune an LLM for user localization. It is more flexible than pre-training models from scratch, and more reliable than pure prompting-based methods.

Multi-modality prediction
- Main features: Using LLMs to jointly consider multi-modal environment information, e.g., tabular data, text, and images, aiming to provide more accurate prediction results.
- Input and fine-tuning requirements: Requires multi-modal input and proper prompts to predict the desired output; LLMs can be specifically pre-trained/fine-tuned to further improve the performance and generalization capabilities.
- Advantages compared with conventional approaches: Multi-modality can take advantage of inputs with various modalities. With these comprehensive inputs, LLMs can better predict the network dynamics than existing methods.
- Telecom prediction application opportunities: Sensing is an important part of 6G networks, and multi-modal sensing can provide more comprehensive input for LLMs, producing more accurate prediction results by jointly considering various inputs, e.g., more accurate traffic load prediction and beam steering.
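The frozen pre-trained LLM row above relies on serializing a numeric time series into a text prompt and parsing the completion back into numbers, in the spirit of zero-shot LLM forecasting [45]. A minimal sketch of that preprocessing is given below; the prompt wording, the 0-100 scaling, and the helper names are illustrative assumptions, and a real deployment would send the prompt to an LLM API instead of using the hard-coded completion.

```python
# Prompting-based time-series forecasting sketch (hypothetical preprocessing):
# a traffic-load series is serialized into a comma-separated integer string,
# sent to a frozen LLM as a plain-text prompt, and the completion is parsed
# back into numbers. No fine-tuning is involved.

def serialize_series(loads, scale=100):
    """Rescale loads (e.g., normalized traffic in [0, 1]) to integers and
    join them into an LLM-friendly string, e.g., '42, 57, 61'."""
    return ", ".join(str(round(x * scale)) for x in loads)

def build_prompt(history):
    # The frozen model only sees this instruction plus the serialized history.
    return (
        "The following are hourly cell traffic loads (0-100). "
        f"Continue the sequence with the next 3 values:\n{serialize_series(history)},"
    )

def parse_completion(text, scale=100):
    """Parse the LLM completion (e.g., '63, 60, 55') back into floats."""
    return [int(tok) / scale for tok in text.replace(",", " ").split()]

history = [0.42, 0.57, 0.61, 0.66]
prompt = build_prompt(history)
# Illustrative completion; a real system would query an LLM with `prompt`.
predicted = parse_completion("63, 60, 55")
```

As the table notes, the appeal of this route is that only the serialization and prompt format must be designed; the LLM parameters stay untouched, which keeps the computational cost low enough for edge deployment.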

VIII. CHALLENGES AND FUTURE DIRECTIONS OF LLM-EMPOWERED TELECOM

This section will introduce the challenges of realizing LLM-empowered telecom, including telecom-domain LLM training, practical LLM deployment in telecom, and prompt engineering for telecom applications. Then, we identify several future directions, e.g., LLM-enabled planning, model compression and fast inference, overcoming hallucination problems, retrieval-augmented LLMs, and economic and affordable LLMs.

A. Challenges of Applying LLM Techniques to Telecom

1) Telecom-domain LLM training: Previous sections have shown the importance of building telecom-specific LLMs, e.g., telecom-domain question answering [115], telecom troubleshooting [23], and standard specification classification [13]. Despite the great potential, training an LLM specifically for telecom presents unique challenges due to the complex nature of telecom networks. In the following, we analyze this challenge in detail.

Sufficient telecom-related datasets are prerequisites for training a telecom LLM. Unlike general-domain LLMs, which can leverage large-scale text corpora on the internet, obtaining a sizable dataset exclusively focused on communication networks can be challenging. Existing studies usually focus on one specific task and then build the corresponding dataset, e.g., the trouble report dataset [23], the 3GPP specification dataset [13], and the telecom question answering dataset [115]. However, these datasets are usually small-scale and task-specific, while a comprehensive large-scale dataset should include network-related documents, standard specifications, protocols, textbooks, research papers, and other relevant sources. Maatouk et al. started this exploration in [125] by building a dataset with 10,000 telecom-related questions and answers, drawn from around 25,000 pages and 6 million words. More efforts are needed to provide more comprehensive and diverse datasets for telecom LLM training.

Meanwhile, it is worth noting that telecom networks involve a large number of concepts, such as network protocols, routing algorithms, network topologies, and network security. Therefore, teaching an LLM to comprehend and reason about these complex concepts requires a robust training strategy. An effective approach is to pre-train the LLM on a large-scale general language corpus and then fine-tune it on specific communication network datasets, e.g., datasets for BS services, historical datasets for prediction, or datasets for edge computing-related tasks. In addition, balancing model size and performance is crucial. LLMs trained on large-scale datasets tend to be computationally expensive and memory-intensive. An appropriate model size can reduce the burden on energy and computation resources during the pre-training and fine-tuning phases, and it also helps ensure practical usability, especially considering

TABLE XII
S UMMARY OF T ELECOM DATASETS FOR LLM

Dataset Task Document size Question size Open-source


5GSum [219] Summarization 713 articles N/A Yes
Tspec-LLM [220] Summarization 30,137 documents 100 questions Yes
TeleQnA [125] Question answering N/A 10,000 questions Yes
NetEval [221] Question answering N/A 5,732 questions Yes
TeleQuAD [115] Question answering N/A 2,021 questions No
StandardsQA [222] Question answering N/A 2,400 questions No
ORAN-Bench-13K [223] Question answering 116 documents 13,952 questions Yes
5GSC [219] Sentence classification 2,401 sentences N/A Yes
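Most of the question-answering datasets in Table XII are multiple-choice, so a common training-free use is benchmarking an off-the-shelf LLM's telecom knowledge by its answer accuracy. The sketch below is a hypothetical harness: the two sample items only mimic a TeleQnA-style format, and `answer_fn` stands in for a call to the model under test.

```python
# Minimal evaluation loop for a TeleQnA-style multiple-choice dataset.
# The item format and `answer_fn` interface are assumptions for illustration.

telecom_qa = [  # hypothetical samples mimicking the dataset structure
    {"question": "Which 3GPP release first specified 5G NR?",
     "options": {"A": "Release 13", "B": "Release 15", "C": "Release 17"},
     "answer": "B"},
    {"question": "What does CSI stand for?",
     "options": {"A": "Channel State Information", "B": "Cell Site Identifier"},
     "answer": "A"},
]

def evaluate(answer_fn, dataset):
    """Return the accuracy of `answer_fn`, which maps (question, options)
    to an option letter such as 'A'."""
    correct = sum(answer_fn(s["question"], s["options"]) == s["answer"]
                  for s in dataset)
    return correct / len(dataset)

# A real study would query a telecom LLM here; this stub always answers 'A'.
baseline_acc = evaluate(lambda q, opts: "A", telecom_qa)
```

The same loop works for any of the open-source QA datasets in the table once their items are mapped into this (question, options, answer) shape.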

scenarios with limited computational capacity such as vehicles and mobile phones. Therefore, techniques like model compression, knowledge distillation, or specialized hardware accelerators can be explored to reduce the model's size and enhance its efficiency without compromising its understanding of communication networks.

The above analyses show that obtaining domain-specific datasets is one of the main bottlenecks of training telecom LLMs. Several datasets have been released recently to address this challenge. As shown in Table XII, there exist different types of datasets, focusing on tasks including text summarization, question answering, and sentence classification. These datasets are mainly extracted from telecom-related documents. By integrating these datasets, LLMs can be trained to understand and respond to customer queries more effectively, predict and mitigate network issues, and identify fraudulent activities with higher accuracy, ultimately leading to improved operational efficiency and customer satisfaction in the telecom sector.

2) Practical LLM deployment in telecom: To leverage the benefits of LLM techniques, the models should be properly deployed in telecom networks. Specifically, LLMs can be deployed at different levels, including the central cloud, network edge, or user devices. We have introduced the features of each approach in Section III-F. However, the related studies are still at a very early stage, and the proposed schemes mainly focus on system-level design and definitions. The following will discuss the key challenges and difficulties of practical LLM deployment in telecom networks.

Firstly, many real-world wireless applications have stringent requirements for service delay, e.g., autonomous driving and robotic control. Under such time constraints, using a central cloud-based LLM to process these latency-critical tasks can be inappropriate, since task uploading and solution downloading may increase the service delay. Additionally, if the task involves image and video processing, the uploading and downloading process will significantly increase the latency, especially considering the limited backhaul capacity. For instance, the image classification tasks introduced in Section V-D require rapid responses for signal blockage prediction and autonomous driving [25], [161], and processing these requests on the cloud can be impractical due to high service latency. In addition, the LLM inference time will also contribute to system latency, ranging from 0.58 to 90 seconds. Therefore, the service time should be very carefully evaluated before using an LLM for latency-critical applications. The network edge provides an efficient approach for computational task processing, and edge intelligence has become an appealing direction for deploying ML algorithms in telecom networks. However, network edge servers have limited computational and storage capacity, while LLMs are usually computationally intensive with large model sizes, which may prevent edge-LLM deployment.

To this end, hybrid deployment can be an ideal solution by combining central cloud, edge, and user device deployments, providing a balance between scalability, low latency, and privacy. Deploying LLMs at different levels, including the central cloud, network edge, or user devices, offers unique opportunities and challenges in telecom applications. For example, large-scale LLMs such as GPT-4 and Llama3-70b can be deployed at the central cloud to handle tasks with high quality requirements on the generated content. Meanwhile, small-scale LLMs can be implemented at the network edge or even on user devices for latency-sensitive tasks. However, coordinating LLMs at different levels can be challenging and requires dedicated designs, e.g., selecting appropriate LLMs for diverse user tasks such as lower latency and price, higher generation quality, or multi-modal tasks.

3) Prompt engineering for telecom applications: Prompt engineering is a crucial aspect of utilizing LLM techniques effectively, as it plays a significant role in guiding the model's behaviour and generating desired outputs. Specifically, prompt engineering refers to the process of structuring an instruction that can be interpreted and understood by generative AI models to produce the desired output. For instance, few-shot learning can be considered one of the prompt engineering approaches, in which LLMs learn from examples and demonstrations to improve their performance on target tasks. Compared with pre-training or fine-tuning an LLM, prompting has a much lower computational cost. In particular, it relies on the LLM's inference capabilities, making it the most straightforward and efficient approach to using LLMs. The high efficiency of prompting techniques aligns well with many telecom applications, which usually require fast responses to network dynamics, e.g., changing channel conditions, user numbers, network traffic levels, etc. However, designing prompts for telecom applications presents unique

challenges due to the domain complexity.

Firstly, telecom networks encompass a wide range of concepts, protocols, and technologies, making it challenging to distill the necessary information into a concise prompt. The diverse nature of the domain requires a deep understanding of networking principles and the ability to capture specific nuances related to network architectures, protocols, performance optimization, and security. To design effective prompts, researchers must identify the most relevant components and provide concise yet comprehensive instructions to LLMs.

Meanwhile, prompt designs should strike a balance between being specific enough to guide the LLM in generating accurate and contextually appropriate responses, while remaining general enough to handle a wide range of network-related queries or tasks. Achieving this balance is crucial, as overly specific prompts may limit the model's ability to generalize, while an overly general prompt may lead to irrelevant responses.

Moreover, telecom tasks often require the LLM to consider contextual information and situational variables. For example, network troubleshooting may involve analyzing network logs, diagnosing performance issues, or identifying security vulnerabilities. Designing prompts that take into account the relevant context and guide the model to consider appropriate factors can significantly enhance the accuracy and relevance of the generated responses. Techniques like providing explicit context cues or utilizing conditional generation can be explored.

To summarize, prompt design of LLMs for telecom applications poses a significant challenge due to the intricate and constantly evolving nature of the domain. Crafting effective prompts necessitates a profound comprehension of networking principles, the capacity to strike a balance between specificity and generality, and an awareness of contextual factors. To this end, a practical solution is to publicize standard prompting templates, which provide fundamental suggestions for the prompt design of each kind of task. For instance, network optimization-related tasks should specify the optimization objective and control variables, providing feedback or examples for previous selections. In addition, researchers and industry experts play a crucial role in developing these standards, since the template design requires professional knowledge and understanding of telecom networks. These templates will then be able to deliver precise, pertinent, and impartial responses in telecom applications.

B. Future Directions

1) Multi-modal LLMs for telecom: Multi-modality is a crucial direction for LLM development, indicating the seamless integration of information in various modalities such as text, image, audio, and video. Such a capability may serve many applications in future telecom networks. For instance, sensing has become a critical pillar for envisioned 6G networks, and multi-modal LLMs can utilize 3D multi-modal data, e.g., text, satellite or street camera images, 3D LiDAR maps, and videos, to provide a holographic understanding of the wireless signal transmission environment [30]. A specific example of mmWave/THz beamforming has been discussed in Section VII-E, which aims to predict signal transmission blockage by using multi-modal input such as images and video. Meanwhile, with multi-modal information of the 3D environment, we can also obtain better CSI estimation results, which is critical for signal transmission and network management [30]. In addition, Xu et al. introduced an example in [17] that utilizes LLMs to generate a traffic accident report using videos collected by vehicles. This video-to-text generation can also be used to analyze videos collected by UAVs to describe wireless signal transmission environments.

2) LLM-enabled planning in telecom: Multi-step planning and scheduling are critical for handling many tasks in the telecom field. For instance, Section IV-C2 has introduced an example of coding wireless projects with step-by-step prompting. Meanwhile, many optimization problems with multiple network elements and control variables have to be solved by dedicated planning [7]. However, recent benchmarks have shown that LLMs struggle with tasks requiring complex planning and sequential decision-making, which may prevent their direct application to many telecom tasks. Some existing studies such as [14] and [18] propose to improve the multi-step planning capabilities by step-by-step and CoT prompting. Despite the satisfactory performance in [14] and [18], they require dedicated analyses to manually decompose a complicated task into multiple sub-tasks. Therefore, future studies should aim at developing better planning algorithms that can be integrated into LLMs, as such multi-step planning capability is crucial for solving telecom-domain tasks. This might involve incorporating structured reasoning and problem-solving frameworks into the models, enabling them to break down tasks into smaller and more manageable sub-tasks. Automated task decomposition can therefore be an attractive solution to improve the planning performance of LLMs. However, automatically decoupling one complicated task into multiple sub-tasks is still very challenging in the telecom field. Additionally, another solution could be integrating simulation environments directly within the training process, allowing models to practice and refine their planning skills in a controlled setting, i.e., improving planning performance by trial-and-error before applying it to real-world telecom tasks.

3) LLM for resource allocation and network optimization: Resource management is a fundamental and crucial problem for network operation, e.g., transmission power and bandwidth resource allocation. Section VI above introduced various LLM-enabled optimization techniques for telecom applications, including LLM-aided reinforcement learning, black-box optimizers, LLM-aided convex optimization, and heuristic algorithm design. These analyses of existing studies have revealed the potential of using LLMs to optimize network performance. For instance, verbal reinforcement learning can take the network operator's human language instructions as input to improve task performance, and LLMs can design novel heuristic algorithms for network resource allocation based on specific task demands. LLM-based optimization has two crucial advantages: firstly, LLMs can integrate human languages into the optimization procedure, which makes network management more accessible with much lower complexity; secondly, LLMs can provide reasons and explanations for their decisions, and this capability is crucial for understanding complicated systems such as telecom networks. Despite these advantages, it is worth noting that some network optimization problems can be extremely complicated, with coupled control variables and correlated network elements. Solving these optimization tasks may require dedicated design and multi-step scheduling, which is still a challenge in the LLM field.

4) LLM-enhanced machine learning for telecom: ML algorithms have been widely applied to wireless networks and achieve satisfactory performance. For example, reinforcement learning is one of the most widely used ML techniques for network optimization, and deep neural networks have been extensively explored to predict CSI. Section VI-B introduced LLM-enhanced reinforcement learning by automating the reward function design, indicating a promising direction to explore LLM-enhanced ML algorithms. For instance, Sahu et al. investigated LLM-aided semi-supervised learning for the task of extractive text summarization, in which they proposed a prompt-based pseudo-labelling strategy with LLMs [224]. Multi-agent learning also has many crucial applications in wireless networks, and existing studies show that LLM-based multi-agents have many promising features [225]. In summary, LLMs bring new opportunities to make conventional ML algorithms more accessible and explainable when applied to telecom networks.

5) Real-world implementations of LLM for the telecom industry: The study in [106] by Apple introduced a method for efficiently running LLMs on devices with limited DRAM capacity. This advancement is particularly beneficial for telecom-specific applications that rely on on-device LLMs. By bringing LLM capabilities onto the device, Qualcomm also integrates on-device AI into smartphones to provide faster and more personalized services without relying on cloud-based solutions [107]. Qualcomm aims to improve user privacy and reduce latency, enabling applications such as real-time language translation, advanced camera features, and direct contextual assistance on the user's device. On-device AI also allows for continuous operation without internet dependency, which is crucial for maintaining service quality in areas with poor connectivity. Additionally, solutions such as Kinetica's SQL-GPT enable telecom professionals to interact with data using natural language queries, converting these queries into SQL for quick and effective analysis [226]. This approach democratizes access to data insights, empowering employees to make faster and more informed decisions. These applications demonstrate the transformative potential of LLMs in the telecom industry, enhancing security, operational efficiency, and customer experience. By continuing to innovate with LLMs, telecom companies can stay ahead in an increasingly AI-driven landscape, providing superior services and maintaining a competitive advantage.

6) Model compression and fast inference for network edge and mobile applications: The model size is one of the key bottlenecks of applying LLMs to the telecom domain, leading to stringent requirements for computational and storage capacities. Therefore, compressing the model size to adapt to network edge and mobile applications becomes a promising direction. In addition, it will also contribute to fast inference of LLMs, since many wireless applications require rapid response times and low latency. For instance, Xu et al. proposed an on-device inference model specifically designed for efficient generative natural language processing tasks [112], achieving a 9.3× faster generation speed. Such a technique can be very promising for LLM-enabled mobile applications in telecom, enabling faster response times for user inquiries. Meanwhile, it is worth noting that compressing the model size may degrade the LLM performance, and how to balance model size and performance requires more research effort. On the one hand, this calls for novel model compression and pruning techniques to reduce the storage and computation burdens at the network edge; on the other hand, standard metrics must be defined to evaluate the performance of LLMs in the telecom domain, e.g., accuracy and hallucination probability, as introduced in Section III-E.

7) Overcoming hallucination problems in telecom applications: Hallucination, or the generation of factually incorrect or nonsensical information, remains a significant issue for LLM applications. Specifically, it means that the LLM may generate nonsensical answers or solutions for a given telecom task. Hallucination can severely undermine the reliability and credibility of LLM-generated content, degrading the performance on many downstream tasks. For instance, a nonsensical answer may be generated when using an LLM for telecom question answering. Overcoming these issues is critical for telecom applications to guarantee network service quality and reliability. To this end, future research should focus on developing methods to reduce hallucination and improve the factual accuracy of model outputs. This could include enhancing the training datasets with more verified and reliable sources, implementing post-generation verification steps, or incorporating cross-referencing mechanisms within the model. Additionally, exploring the use of external knowledge bases and real-time fact-checking during the generation process could help mitigate this issue. Recently, it has been demonstrated that under specific evaluation conditions, LLMs exhibit exceptional zero-shot capabilities in assessing factual consistency [227]. This underscores their potential to become leading evaluators of hallucination in various contexts. Moreover, techniques such as adversarial testing, in which models are deliberately presented with complex or misleading inputs, can also help in assessing their susceptibility to hallucination.

8) Retrieval augmented-LLM for telecom: Retrieval augmentation is an important direction for LLM development, which retrieves facts from an external knowledge base to ground LLMs on the most up-to-date information. Telecom networks are constantly evolving and updating, and retrieval augmentation has great potential for telecom applications. In particular, retrieval-augmented LLMs can improve the quality and relevance of the generated responses since the LLM has

access to more accurate and relevant information. However, current retrieval-augmented generation models increase the context length, which in turn decreases the efficiency of the model due to the added computational cost, and may lead to severe slow-response issues. Such slow responses may increase the overall network latency and degrade the service quality, preventing application in scenarios with tight delay budgets, which are very common in wireless networks. Therefore, future research could focus on improving the efficiency of retrieval-augmented generation by optimizing retrieval mechanisms to balance context relevance and length. This could involve developing more advanced indexing and search algorithms that require less memory. Additionally, dynamically adjusting the amount of retrieved information based on the query's complexity could help maintain or improve efficiency without sacrificing the quality of the output.

9) Economic and affordable LLMs: Despite the great potential and advantages, training an LLM can be financially expensive. For instance, the training expenses for GPT-4 exceeded $100 million, and the LLaMa2 70B model was trained on 2048 A100 GPUs for 23 days with a $1.7 million estimated cost [228]. Although training smaller models such as LLaMa2 7B can be much cheaper, the affordability of LLM techniques is still one of the main concerns. For instance, the study in [229] shows that using GPT-4 to support customer service can cost more than $21,000 per month for a small business. Meanwhile, there are many LLM APIs with various prices, including a prompt cost proportional to the prompt length, a generation cost related to the generation length, and a possible fixed cost per query. For example, it costs $30 for 10M tokens using OpenAI's GPT-4, but only $0.2 for GPT-J hosted by TextSynth [230]. The financing cost of training, fine-tuning, and deploying LLMs will significantly affect their application in telecom networks. However, advancements like OpenAI's GPT-4o mini, a cost-efficient small model, offer promising solutions. Priced at just $0.15 per million input tokens and $0.60 per million output tokens, GPT-4o mini is more affordable than previous frontier models and over 60% cheaper than GPT-3.5 Turbo. This affordability, combined with its superior performance in reasoning, mathematical, and coding tasks, enables a broad range of cost-effective applications. Similarly, Llama3-8b is also an affordable small-scale model with fast inference speed. These small-scale models may alleviate the economic cost of LLMs with fewer parameters, lower training and fine-tuning costs, and faster inference time. For instance, telecom companies can use GPT-4o mini for customer support chatbots that handle vast conversation histories or for network management systems that analyze extensive performance metrics in real time. Given the heterogeneous prices and service quality, it is of great importance to evaluate the financing cost of deploying LLMs in telecom networks, e.g., balancing the possible performance improvement against the LLM deployment cost, and using LLMs in an economic manner for telecom applications. However, this direction has limited existing studies, and it still requires more research efforts.

IX. CONCLUSION

Recently, large language models (LLMs) have shown great promise in many fields, especially for language-related tasks such as summarization and question answering. LLM-based solutions have also been preliminarily investigated in the telecom field. In this work, we aim to present a comprehensive survey on LLMs for telecom. In particular, we first introduced the LLM fundamentals, presenting a comprehensive overview of the model architecture, pre-training, fine-tuning, inference and utilization, evaluation, and deployment of LLM-based solutions. Then, a comprehensive survey of existing works on the key techniques and applications in terms of generation, classification, optimization, and prediction problems is presented. These investigations and analyses have proven that LLMs have outstanding potential to bring artificial general intelligence to the telecom field through in-context and zero-shot learning capabilities. Finally, we discussed the key challenges, such as datasets and cost, as well as future research opportunities of LLM-empowered telecom. We hope this work can serve as a good reference for researchers and engineers to better understand the existing works, potential, challenges, and opportunities of applying LLMs in the telecom field.

REFERENCES

[1] Z. Zhang, Y. Xiao, Z. Ma, M. Xiao, Z. Ding, X. Lei, G. K. Karagiannidis, and P. Fan, “6G wireless networks: Vision, requirements, architecture, and key technologies,” IEEE Vehicular Technology Magazine, vol. 14, no. 3, pp. 28–41, 2019.
[2] I. T. U. (ITU), “IMT towards 2030 and beyond,” 2023. [Online]. Available: https://www.itu.int/en/ITU-R/study-groups/rsg5/rwp5d/imt-2030/Pages/default.aspx
[3] H. Zhou, M. Erol-Kantarci, and V. Poor, “Knowledge transfer and reuse: A case study of AI-enabled resource management in RAN slicing,” IEEE Wireless Communications, vol. 30, no. 5, pp. 160–169, Oct. 2022.
[4] H. Zhou, M. Elsayed, and M. Erol-Kantarci, “RAN resource slicing in 5G using multi-agent correlated Q-learning,” in Proc. of 2021 IEEE PIMRC, Sep. 2021, pp. 1179–1184.
[5] C. Luo, J. Ji, Q. Wang, X. Chen, and P. Li, “Channel state information prediction for 5G wireless communications: A deep learning approach,” IEEE Trans. on Network Science and Engineering, vol. 7, no. 1, pp. 227–236, 2018.
[6] H. Zhang, H. Zhou, and M. Erol-Kantarci, “Federated deep reinforcement learning for resource allocation in O-RAN slicing,” in Proc. of IEEE 2022 GLOBECOM Conf., 2022, pp. 958–963.
[7] H. Zhou, M. Erol-Kantarci, Y. Liu, and H. V. Poor, “A survey on model-based, heuristic, and machine learning optimization approaches in RIS-aided wireless networks,” IEEE Communications Surveys & Tutorials (Early access), Dec. 2023.
[8] K. Singhal, T. Tu, J. Gottweis, R. Sayres, E. Wulczyn, L. Hou, K. Clark, S. Pfohl, H. Cole-Lewis, D. Neal et al., “Towards expert-level medical question answering with large language models,” arXiv preprint arXiv:2305.09617, 2023.
[9] P. Colombo, T. P. Pires, M. Boudiaf, D. Culver, R. Melo, C. Corro, A. F. Martins, F. Esposito, V. L. Raposo, S. Morgado et al., “SaulLM-7B: A pioneering large language model for law,” arXiv preprint arXiv:2403.03883, 2024.
[10] S. Wu, O. Irsoy, S. Lu, V. Dabravolski, M. Dredze, S. Gehrmann, P. Kambadur, D. Rosenberg, and G. Mann, “BloombergGPT: A large language model for finance,” arXiv preprint arXiv:2303.17564, 2023.
[11] W. X. Zhao, K. Zhou, J. Li, T. Tang, X. Wang, et al., “A survey of large language models,” arXiv:2303.18223, 2023.
[12] J. Wei, L. Hou, A. Lampinen, X. Chen, D. Huang, Y. Tay, X. Chen, Y. Lu, D. Zhou, T. Ma et al., “Symbol tuning improves in-context learning in language models,” arXiv preprint arXiv:2305.08298, 2023.

[13] L. Bariah, H. Zou, Q. Zhao, B. Mouhouche, F. Bader, and M. Debbah, “Understanding telecom language through large language models,” arXiv:2306.07933, 2023.
[14] Y. Du, S. C. Liew, K. Chen, and Y. Shao, “The power of large language models for wireless communication system development: A case study on FPGA platforms,” arXiv:2307.07319, 2023.
[15] Y. Shen, J. Shao, X. Zhang, Z. Lin, H. Pan, D. Li, J. Zhang, and K. B. Letaief, “Large language models empowered autonomous edge AI for connected intelligence,” IEEE Communications Magazine, 2024.
[16] Z. Lin, G. Qu, Q. Chen, X. Chen, Z. Chen, and K. Huang, “Pushing large language models to the 6G edge: Vision, challenges, and opportunities,” arXiv:2309.16739, 2023.
[17] M. Xu, N. Dusit, J. Kang, Z. Xiong, S. Mao, Z. Han, D. I. Kim, and K. B. Letaief, “When large language model agents meet 6G networks: Perception, grounding, and alignment,” arXiv:2401.07764, 2024.
[18] Q. Xiang, Y. Lin, M. Fang, B. Huang, S. Huang, R. Wen, F. Le, L. Kong, and J. Shu, “Toward reproducing network research results using large language models,” in Proc. of the 22nd ACM Workshop on Hot Topics in Networks, 2023, pp. 56–62.
[19] L. Li, Y. Zhang, and L. Chen, “Prompt distillation for efficient LLM-based recommendation,” in Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023, pp. 1348–1357.
[20] Z. Xi, W. Chen, X. Guo, W. He, Y. Ding, B. Hong, M. Zhang, J. Wang, S. Jin, E. Zhou et al., “The rise and potential of large language model based agents: A survey,” arXiv preprint arXiv:2309.07864, 2023.
[21] C. Yang, X. Wang, Y. Lu, H. Liu, Q. V. Le, D. Zhou, and X. Chen, “Large language models as optimizers,” arXiv:2309.03409, 2023.
[22] D. Wu, X. Wang, Y. Qiao, Z. Wang, J. Jiang, S. Cui, and F. Wang, “Large language model adaptation for networking,” arXiv:2402.02338, 2024.
[23] N. Bosch, “Integrating telecommunications-specific language models into a trouble report retrieval approach,” Master’s thesis, KTH, School of Electrical Engineering and Computer Science (EECS), 2022.
[24] Y. Xu, Y. Chen, X. Zhang, X. Lin, P. Hu, Y. Ma, S. Lu, W. Du, Z. Mao, E. Zhai et al., “CloudEval-YAML: A practical benchmark for cloud

enabling techniques, and challenges,” arXiv preprint arXiv:2311.17474, 2023.
[36] W. Wang, C. Zhou, H. He, W. Wu, W. Zhuang, and X. Shen, “Cellular traffic load prediction with LSTM and Gaussian process regression,” in Proc. of 2020 IEEE Intl. Conf. on Communications (ICC), 2020, pp. 1–6.
[37] C. Zhao, H. Du, D. Niyato, J. Kang, Z. Xiong, D. I. Kim, K. B. Letaief et al., “Generative AI for secure physical layer communications: A survey,” arXiv preprint arXiv:2402.13553, 2024.
[38] C. Liang, H. Du, Y. Sun, D. Niyato, J. Kang, D. Zhao, and M. A. Imran, “Generative AI-driven semantic communication networks: Architecture, technologies and applications,” arXiv preprint arXiv:2401.00124, 2023.
[39] R. Zhang, K. Xiong, H. Du, D. Niyato, J. Kang, X. Shen, and H. V. Poor, “Generative AI-enabled vehicular networks: Fundamentals, framework, and case study,” IEEE Network, 2024.
[40] H. Du, D. Niyato, J. Kang, Z. Xiong, P. Zhang, S. Cui, X. Shen, S. Mao, Z. Han, A. Jamalipour et al., “The age of generative AI and AI-generated everything,” IEEE Network, 2024.
[41] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., “Chain-of-thought prompting elicits reasoning in large language models,” Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022.
[42] J. Song, Z. Zhou, J. Liu, C. Fang, Z. Shu, and L. Ma, “Self-refined large language model as automated reward function designer for deep reinforcement learning in robotics,” arXiv:2309.06687, 2023.
[43] M. Kwon, S. M. Xie, K. Bullard, and D. Sadigh, “Reward design with language models,” arXiv:2303.00001, 2023.
[44] Y. J. Ma, W. Liang, G. Wang, D.-A. Huang, O. Bastani, D. Jayaraman, Y. Zhu, L. Fan, and A. Anandkumar, “Eureka: Human-level reward design via coding large language models,” arXiv:2310.12931, 2023.
[45] N. Gruver, M. Finzi, S. Qiu, and A. G. Wilson, “Large language models are zero-shot time series forecasters,” Advances in Neural Information Processing Systems, vol. 36, 2024.
[46] Y. Li, Z. Li, K. Zhang, R. Dan, S. Jiang, and Y. Zhang, “ChatDoctor: A medical chat model fine-tuned on a large language model meta-AI (LLaMA) using medical domain knowledge,” Cureus, vol. 15, no. 6, 2023.
configuration generation,” arXiv preprint arXiv:2401.06786, 2023.
[47] M. Jin, S. Wang, L. Ma, Z. Chu, J. Y. Zhang, X. Shi, P.-Y. Chen,
[25] G. Charan, M. Alrabeiah, and A. Alkhateeb, “Vision-aided 6G wireless
Y. Liang, Y.-F. Li, S. Pan et al., “Time-llm: Time series forecasting by
communications: Blockage prediction and proactive handoff,” IEEE
reprogramming large language models,” arXiv:2310.01728, 2023.
Trans. on Vehicular Technology, vol. 70, no. 10, pp. 10 193–10 208,
[48] Z. Xu, Y. Zhang, E. Xie, Z. Zhao, Y. Guo, K. K. Wong, Z. Li, and
2021.
H. Zhao, “Drivegpt4: Interpretable end-to-end autonomous driving via
[26] M. Matsuura, Y. K. Jung, and S. N. Lim, “Visual-LLM zero-shot large language model,” arXiv preprint arXiv:2310.01412, 2023.
classification,” 2023. [Online]. Available: https://www.crcv.ucf.edu/ [49] J. Zhang, Y. Hou, R. Xie, W. Sun, J. McAuley, W. X. Zhao, L. Lin, and
wp-content/uploads/2018/11/Misaki-Final-report.pdf J.-R. Wen, “Agentcf: Collaborative learning with autonomous language
[27] S. Booth, W. B. Knox, J. Shah, S. Niekum, P. Stone, and A. Allievi, agents for recommender systems,” arXiv preprint arXiv:2310.09233,
“The perils of trial-and-error reward design: misdesign through over- 2023.
fitting and invalid task specifications,” in Proc. of the AAAI Conf. on [50] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N.
Artificial Intelligence, vol. 37, no. 5, 2023, pp. 5920–5929. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,”
[28] S. Tarkoma, R. Morabito, and J. Sauvola, “AI-native interconnect Advances in Neural Information Processing Systems, vol. 30, 2017.
framework for integration of large language model technologies in 6G [51] N. Shazeer, “Fast transformer decoding: One write-head is all you
systems,” arXiv:2311.05842, 2023. need,” arXiv:1911.02150, 2019.
[29] Z. Chen, Z. Zhang, and Z. Yang, “Big AI models for 6G wire- [52] J. Ainslie, J. Lee-Thorp, M. de Jong, Y. Zemlyanskiy, F. Lebrón, and
less networks: Opportunities, challenges, and research directions,” S. Sanghai, “Gqa: Training generalized multi-query transformer models
arXiv:2308.06250, 2023. from multi-head checkpoints,” 2023.
[30] L. Bariah, Q. Zhao, H. Zou, Y. Tian, F. Bader, and M. Debbah, “Large [53] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “Bert: Pre-
language models for telecom: The next big thing?” arXiv:2306.10249, training of deep bidirectional transformers for language understanding,”
2023. arXiv:1810.04805, 2018.
[31] S. Xu, C. K. Thomas, O. Hashash, N. Muralidhar, W. Saad, and [54] Y. Liu, M. Ott, N. Goyal, J. Du, M. Joshi, D. Chen, O. Levy, M. Lewis,
N. Ramakrishnan, “Large multi-modal models (LMMs) as universal L. Zettlemoyer, and V. Stoyanov, “Roberta: A robustly optimized bert
foundation models for AI-native wireless systems,” arXiv:2402.01748, pretraining approach,” arXiv:1907.11692, 2019.
2024. [55] Z. Lan, M. Chen, S. Goodman, K. Gimpel, P. Sharma, and R. Soricut,
[32] S. Javaid, R. A. Khalil, N. Saeed, B. He, and M.-S. Alouini, “Leverag- “Albert: A lite bert for self-supervised learning of language represen-
ing Large Language Models for Integrated Satellite-Aerial-Terrestrial tations,” arXiv:1909.11942, 2019.
Networks: Recent Advances and Future Directions,” arXiv preprint [56] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena,
arXiv:2407.04581, 2024. Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning
[33] A. Maatouk, N. Piovesan, F. Ayed, A. De Domenico, and M. Debbah, with a unified text-to-text transformer,” Journal of machine learning
“Large language models for telecom: Forthcoming impact on the research, vol. 21, no. 140, pp. 1–67, 2020.
industry,” IEEE Communications Magazine, 2024. [57] M. Lewis, Y. Liu, N. Goyal, M. Ghazvininejad, A. Mohamed, O. Levy,
[34] J. Shao, J. Tong, Q. Wu, W. Guo, Z. Li, Z. Lin, and J. Zhang, V. Stoyanov, and L. Zettlemoyer, “Bart: Denoising sequence-to-
“WirelessLLM: Empowering Large Language Models Towards Wire- sequence pre-training for natural language generation, translation, and
less Intelligence,” arXiv preprint arXiv:2405.17053, 2024. comprehension,” arXiv:1910.13461, 2019.
[35] Y. Huang, H. Du, X. Zhang, D. Niyato, J. Kang, Z. Xiong, S. Wang, [58] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal,
and T. Huang, “Large language models for networking: Applications, A. Neelakantan, P. Shyam, G. Sastry, A. Askell et al., “Language mod-

48
els are few-shot learners,” Advances in Neural Information Processing Scaling Reinforcement Learning from Human Feedback with AI
Systems, vol. 33, pp. 1877–1901, 2020. Feedback,” 2023. [Online]. Available: https://arxiv.org/abs/2309.00267
[59] A. Chowdhery, S. Narang, J. Devlin, M. Bosma, G. Mishra, A. Roberts, [81] A. Madaan, N. Tandon, P. Gupta, S. Hallinan, L. Gao, S. Wiegreffe,
P. Barham, H. W. Chung, C. Sutton, S. Gehrmann et al., “Palm: Scaling U. Alon, N. Dziri, S. Prabhumoye, Y. Yang et al., “Self-refine: Iter-
language modeling with pathways,” Journal of Machine Learning ative refinement with self-feedback,” Advances in Neural Information
Research, vol. 24, no. 240, pp. 1–113, 2023. Processing Systems, vol. 36, 2024.
[60] H. Touvron, T. Lavril, G. Izacard, X. Martinet, M.-A. Lachaux, [82] J. Wei, X. Wang, D. Schuurmans, M. Bosma et al., “Chain-of-thought
T. Lacroix, B. Rozière, N. Goyal, E. Hambro, F. Azhar et al., “Llama: prompting elicits reasoning in large language models,” Advances in
Open and efficient foundation language models,” arXiv:2302.13971, Neural Information Processing Systems, vol. 35, pp. 24 824–24 837,
2023. 2022.
[61] A. Zeng, X. Liu, Z. Du, Z. Wang, H. Lai, M. Ding, Z. Yang, Y. Xu, [83] J. Liu, D. Shen, Y. Zhang, B. Dolan, L. Carin, and W. Chen,
W. Zheng, X. Xia et al., “Glm-130b: An open bilingual pre-trained “What makes good in-context examples for gpt-3?” arXiv preprint
model,” arXiv:2210.02414, 2022. arXiv:2101.06804, 2021.
[62] M. Guo, Z. Dai, D. Vrandečić, and R. Al-Rfou, “Wiki-40b: Multilin- [84] Y. Lu, M. Bartolo, A. Moore, S. Riedel, and P. Stenetorp, “Fantastically
gual language model dataset,” in Proceedings of the 12th Language ordered prompts and where to find them: Overcoming few-shot prompt
Resources and Evaluation Conference, 2020, pp. 2440–2452. order sensitivity,” arXiv preprint arXiv:2104.08786, 2021.
[63] I. Beltagy, K. Lo, and A. Cohan, “SciBERT: A pretrained language [85] N. Wies, Y. Levine, and A. Shashua, “The learnability of in-context
model for scientific text,” arXiv:1903.10676, 2019. learning,” Advances in Neural Information Processing Systems, vol. 36,
[64] F. F. Xu, U. Alon, G. Neubig, and V. J. Hellendoorn, “A systematic 2024.
evaluation of large language models of code,” in Proc. of the 6th ACM [86] J. Von Oswald, E. Niklasson, E. Randazzo, J. Sacramento, A. Mord-
SIGPLAN Intl. Symposium on Machine Programming, 2022. vintsev, A. Zhmoginov, and M. Vladymyrov, “Transformers learn in-
[65] H. Laurençon, L. Saulnier, T. Wang, C. Akiki et al., “The bigscience context by gradient descent,” in Proc. of the 40th ICML, 2023, pp.
roots corpus: A 1.6 TB composite multilingual dataset,” Advances in 35 151–35 174.
Neural Information Processing Systems, vol. 35, pp. 31 809–31 826, [87] J. Wei, J. Wei, Y. Tay, D. Tran, A. Webson, Y. Lu, X. Chen, H. Liu,
2022. D. Huang, D. Zhou et al., “Larger language models do in-context
[66] T. Kudo, “Subword regularization: Improving neural network trans- learning differently,” arXiv:2303.03846, 2023.
lation models with multiple subword candidates,” arXiv:1804.10959,
[88] S. Yao, D. Yu, J. Zhao, I. Shafran, T. Griffiths, Y. Cao, and
2018.
K. Narasimhan, “Tree of thoughts: Deliberate problem solving with
[67] Z. Li, S. Zhuang, S. Guo, D. Zhuo, H. Zhang, D. Song, and I. Stoica, large language models,” Advances in Neural Information Processing
“Terapipe: Token-level pipeline parallelism for training large-scale Systems, vol. 36, 2024.
language models,” in Proc. of 38th Intl. Conf. on Machine Learning
[89] J. Qian, H. Wang, Z. Li, S. Li, and X. Yan, “Limitations of lan-
(ICML), 2021, pp. 6543–6552.
guage models in arithmetic and symbolic induction,” arXiv preprint
[68] Y. Huang, Y. Cheng, A. Bapna, O. Firat, D. Chen, M. Chen, H. Lee, arXiv:2208.05051, 2022.
J. Ngiam, Q. V. Le, Y. Wu et al., “Gpipe: Efficient training of
giant neural networks using pipeline parallelism,” Advances in Neural [90] D. Zhou, N. Schärli, L. Hou, J. Wei, N. Scales, X. Wang, D. Schu-
Information Processing Systems, vol. 32, 2019. urmans, C. Cui, O. Bousquet, Q. Le et al., “Least-to-most prompting
enables complex reasoning in large language models,” arXiv preprint
[69] M. Shoeybi, M. Patwary, R. Puri, P. LeGresley, J. Casper, and B. Catan-
arXiv:2205.10625, 2022.
zaro, “Megatron-lm: Training multi-billion parameter language models
using model parallelism,” arXiv:1909.08053, 2019. [91] L. Wang, W. Xu, Y. Lan, Z. Hu, Y. Lan, R. K.-W. Lee, and E.-P.
[70] S. Rajbhandari, J. Rasley, O. Ruwase, and Y. He, “Zero: Memory Lim, “Plan-and-solve prompting: Improving zero-shot chain-of-thought
optimizations toward training trillion parameter models,” in Proc. reasoning by large language models,” arXiv preprint arXiv:2305.04091,
of SC20: Intl. Conf. for High Performance Computing, Networking, 2023.
Storage and Analysis, 2020, pp. 1–16. [92] C. Hu, H. Zhou, D. Wu, X. Chen, J. Yan, and X. Liu, “Trafficllm: Self-
[71] J. Wei, M. Bosma, V. Y. Zhao et al., “Finetuned language models are refined large language models for communication traffic prediction,”
zero-shot learners,” arXiv preprint arXiv:2109.01652, 2021. IEEE Transactions on Vehicular Technology, under review, 2024.
[72] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright, P. Mishkin, [93] C.-Y. Lin, “Rouge: A package for automatic evaluation of summaries,”
C. Zhang, S. Agarwal, K. Slama, A. Ray et al., “Training language in Text summarization branches out, 2004, pp. 74–81.
models to follow instructions with human feedback,” Advances in [94] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, and Y. Artzi,
Neural Information Processing Systems, vol. 35, pp. 27 730–27 744, “Bertscore: Evaluating text generation with bert,” arXiv preprint
2022. arXiv:1904.09675, 2019.
[73] OpenAI, “Gpt-4v(ision) system card,” OpenAI, 2023. [95] M. Gao and X. Wan, “Dialsummeval: Revisiting summarization evalu-
[74] H. W. Chung, L. Hou, S. Longpre, B. Zoph, Y. Tay, W. Fedus, Y. Li, ation for dialogues,” in Proc. of the 2022 Conf. of the North American
X. Wang, M. Dehghani, S. Brahma et al., “Scaling instruction-finetuned Chapter of the Association for Computational Linguistics: Human
language models,” arXiv:2210.11416, 2022. Language Technologies, 2022, pp. 5693–5709.
[75] D. M. Ziegler, N. Stiennon, J. Wu, T. B. Brown, A. Radford, [96] Y. Zha, Y. Yang, R. Li, and Z. Hu, “Alignscore: Evaluating fac-
D. Amodei, P. Christiano, and G. Irving, “Fine-tuning language models tual consistency with a unified alignment function,” arXiv preprint
from human preferences,” arXiv preprint arXiv:1909.08593, 2019. arXiv:2305.16739, 2023.
[76] P. F. Christiano, J. Leike, T. Brown, M. Martic, S. Legg, and [97] S. Samsi, D. Zhao, J. McDonald, B. Li, A. Michaleas, M. Jones,
D. Amodei, “Deep reinforcement learning from human preferences,” W. Bergeron, J. Kepner, D. Tiwari, and V. Gadepally, “From words to
Advances in Neural Information Processing Systems, vol. 30, 2017. watts: Benchmarking the energy costs of large language model infer-
[77] C. Zhou, P. Liu, P. Xu, S. Iyer, J. Sun, Y. Mao, X. Ma, A. Efrat, P. Yu, ence,” in 2023 IEEE High Performance Extreme Computing Conference
L. Yu et al., “Lima: Less is more for alignment,” arXiv:2305.11206, (HPEC). IEEE, 2023, pp. 1–9.
2023. [98] A. Faiz, S. Kaneda, R. Wang, R. Osi, P. Sharma, F. Chen, and
[78] R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, L. Jiang, “Llmcarbon: Modeling the end-to-end carbon footprint of
and C. Finn, “Direct preference optimization: Your language large language models,” arXiv preprint arXiv:2309.14393, 2023.
model is secretly a reward model,” 2023. [Online]. Available: [99] J. McDonald, B. Li, N. Frey, D. Tiwari, V. Gadepally, and S. Samsi,
https://arxiv.org/abs/2305.18290 “Great power, great responsibility: Recommendations for reducing en-
[79] Y. Bai, S. Kadavath, S. Kundu, A. Askell, J. Kernion, A. Jones, ergy for training language models,” arXiv preprint arXiv:2205.09646,
A. Chen, A. Goldie, A. Mirhoseini, C. McKinnon et al., “Constitutional 2022.
AI: Harmlessness from AI feedback,” arXiv preprint arXiv:2212.08073, [100] Y. Chang, X. Wang, J. Wang, Y. Wu, L. Yang, K. Zhu, H. Chen, X. Yi,
2022. C. Wang, Y. Wang et al., “A survey on evaluation of large language
[80] H. Lee, S. Phatale, H. Mansoor, T. Mesnard, J. Ferret, K. Lu, models,” ACM Transactions on Intelligent Systems and Technology,
C. Bishop, E. Hall, V. Carbune, A. Rastogi, and S. Prakash, “RLAIF: 2023.

49
[101] C. Ziems, W. Held, O. Shaikh, J. Chen, Z. Zhang, and D. Yang, in Proc. of 2023 19th Intl. Conf. on Network and Service Management
“Can large language models transform computational social science?” (CNSM), 2023, pp. 1–7.
Computational Linguistics, pp. 1–55, 2024. [123] C. Wang, M. Scazzariello, A. Farshin, D. Kostic, and M. Chiesa,
[102] P. Liang, R. Bommasani, T. Lee, D. Tsipras, D. Soylu, M. Yasunaga, “Making network configuration human friendly,” arXiv preprint
Y. Zhang, D. Narayanan, Y. Wu, A. Kumar et al., “Holistic evaluation arXiv:2309.06342, 2023.
of language models,” arXiv preprint arXiv:2211.09110, 2022. [124] R. Mondal, A. Tang, R. Beckett, T. Millstein, and G. Varghese, “What
[103] N. Ding, Y. Qin, G. Yang, F. Wei, Z. Yang, Y. Su, S. Hu, Y. Chen, C.- do LLMs need to synthesize correct router configurations?” in Proc.
M. Chan, W. Chen et al., “Parameter-efficient fine-tuning of large-scale of the 22nd ACM Workshop on Hot Topics in Networks, 2023, pp.
pre-trained language models,” Nature Machine Intelligence, vol. 5, 189–195.
no. 3, pp. 220–235, 2023. [125] A. Maatouk, F. Ayed, N. Piovesan, A. De Domenico, M. Debbah, and
[104] Z. Lin, G. Zhu, Y. Deng, X. Chen, Y. Gao, K. Huang, and Y. Fang, Z.-Q. Luo, “Teleqna: A benchmark dataset to assess large language
“Efficient parallel split learning over resource-constrained wireless models telecommunications knowledge,” arXiv:2310.15051, 2023.
edge networks,” IEEE Trans. on Mobile Computing, 2024. [126] E. Ibarrola, K. Jakobs, M. H. Sherif, and D. Sparrell, “The evolution
[105] T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer, “Qlora: Ef- of telecom business, economy, policies and regulations,” IEEE Com-
ficient finetuning of quantized LLMs,” Advances in Neural Information munications Magazine, vol. 61, no. 7, pp. 16–17, 2023.
Processing Systems, vol. 36, 2024. [127] Y. Gu, R. Tinn, H. Cheng, M. Lucas, N. Usuyama, X. Liu, T. Naumann,
[106] K. Alizadeh, I. Mirzadeh, D. Belenko, K. Khatamifard, M. Cho, C. C. J. Gao, and H. Poon, “Domain-specific language model pretraining for
Del Mundo, M. Rastegari, and M. Farajtabar, “Llm in a flash: Efficient biomedical natural language processing,” ACM Trans. on Computing
large language model inference with limited memory,” arXiv preprint for Healthcare (HEALTH), vol. 3, no. 1, pp. 1–23, 2021.
arXiv:2312.11514, 2023. [128] P. Bajaj, D. Campos, N. Craswell, L. Deng, J. Gao, X. Liu,
[107] Qualcomm, “Qualcomm brings the best of on-device ai to R. Majumder, A. McNamara, B. Mitra, T. Nguyen et al., “MS
more smartphones with snapdragon 8s gen 3,” 2024. [On- MARCO: A human generated machine reading comprehension
line]. Available: https://www.qualcomm.com/news/releases/2024/03/ dataset,” arXiv:1611.09268, 2016.
qualcomm-brings-the-best-of-on-device-ai-to-more-smartphones-wit [129] E. Nijkamp, B. Pang, H. Hayashi, L. Tu, H. Wang, Y. Zhou,
[108] B. Yang, L. He, N. Ling, Z. Yan, G. Xing, X. Shuai, X. Ren, and S. Savarese, and C. Xiong, “Codegen: An open large language model
X. Jiang, “Edgefm: Leveraging foundation model for open-set learning for code with multi-turn program synthesis,” arXiv:2203.13474, 2022.
on the edge,” arXiv preprint arXiv:2311.10986, 2023. [130] G. Lacerda, F. Petrillo, M. Pimenta, and Y. G. Guéhéneuc, “Code
[109] B. Peng, C. Li, P. He, M. Galley, and J. Gao, “Instruction tuning with smells and refactoring: A tertiary systematic review of challenges and
gpt-4,” arXiv:2304.03277, 2023. observations,” Journal of Systems and Software, vol. 167, p. 110610,
[110] S. Zhang, L. Dong, X. Li, S. Zhang, X. Sun, S. Wang, J. Li, R. Hu, 2020.
T. Zhang, F. Wu et al., “Instruction tuning for large language models: [131] X. Jiao, W. Liu, M. Mehari, M. Aslam, and I. Moerman, “openwifi:
A survey,” arXiv:2308.10792, 2023. a free and open-source IEEE802.11 SDR implementation on SoC,” in
Proc. 2020 IEEE 91st Vehicular Technology Conf. (VTC2020-Spring),
[111] Y. Wang, K. Chen, H. Tan, and K. Guo, “Tabi: An efficient multi-level
2020, pp. 1–2.
inference system for large language models,” in Proc. 18th European
[132] B. A. A. Nunes, M. Mendonca, X. N. Nguyen, K. Obraczka, and
Conf. on Computer Systems, 2023, pp. 233–248.
T. Turletti, “A survey of software-defined networking: Past, present,
[112] D. Xu, W. Yin, X. Jin, Y. Zhang, S. Wei, M. Xu, and X. Liu, “LLM-
and future of programmable networks,” IEEE Commun. Surv. Tutorials,
Cad: Fast and scalable on-device large language model inference,”
vol. 16, no. 3, pp. 1617–1634, 2014.
arXiv:2309.04255, 2023.
[133] A. El-Hassany, P. Tsankov, L. Vanbever, and M. T. Vechev, “Netcom-
[113] Y. Chen, R. Li, Z. Zhao, C. Peng, J. Wu, E. Hossain, and H. Zhang, plete: Practical network-wide configuration synthesis with autocom-
“Netgpt: An ai-native network architecture for provisioning beyond pletion,” in Proc. of 15th USENIX Symposium on Networked Systems
personalized generative services,” IEEE Network, 2024. Design and Implementation, 2018, pp. 579–594.
[114] H. Zhou, M. Elsayed, M. Bavand, R. Gaigalas, S. Furr, and M. Erol- [134] H. Chen, Y. Jin, W. Wang, W. Liu, L. You, L. Fu, and Q. Xiang, “When
Kantarci, “Cooperative hierarchical deep reinforcement learning based configuration verification meets machine learning: A DRL approach for
joint sleep, power, and ris control for energy-efficient hetnet,” finding minimum k-link failures,” in Proc. of 24st Asia-Pacific Network
arXiv:2304.13226, 2023. Operations and Management Symposium, 2023, pp. 83–88.
[115] H. Holm, “Bidirectional encoder representations from transformers [135] A. Fogel, S. Fung, L. Pedrosa, M. Walraed-Sullivan, R. Govindan,
(bert) for question answering in the telecom domain,” Master’s thesis, R. Mahajan, and T. D. Millstein, “A general approach to network con-
KTH, School of Electrical Engineering and Computer Science (EECS), figuration analysis,” in Proc. 12th USENIX Symposium on Networked
2021. Systems Design and Implementation, 2015, pp. 469–483.
[116] N. Marzo i Grimalt, “Natural language processing model for log [136] E. Aghaei, X. Niu, W. Shadid, and E. Al-Shaer, “Securebert: A domain-
analysis to retrieve solutions for troubleshooting processes,” Master’s specific language model for cybersecurity,” in Proc. of Intl. Conf. on
thesis, KTH, School of Electrical Engineering and Computer Science Security and Privacy in Communication Systems, 2022, pp. 39–56.
(EECS), 2021. [137] K. Ameri, M. Hempel, H. Sharif, J. Lopez Jr, and K. Perumalla,
[117] S. Soman and R. HG, “Observations on LLMs for telecom domain: “Cybert: Cybersecurity claim classification by fine-tuning the bert
Capabilities and limitations,” arXiv:2305.13102, 2023. language model,” Journal of Cybersecurity and Privacy, vol. 1, no. 4,
[118] B. Wang, Z. Wang, X. Wang, Y. Cao, R. A Saurous, and Y. Kim, pp. 615–637, 2021.
“Grammar prompting for domain-specific language generation with [138] J. Yin, M. Tang, J. Cao, and H. Wang, “Apply transfer learning to cy-
large language models,” Advances in Neural Information Processing bersecurity: Predicting exploitability of vulnerabilities by description,”
Systems, vol. 36, 2024. Knowledge-Based Systems, vol. 210, p. 106529, 2020.
[119] S. K. Mani, Y. Zhou, K. Hsieh, S. Segarra, T. Eberl, E. Azulai, I. Frizler, [139] M. A. Ferrag, M. Ndhlovu, N. Tihanyi, L. C. Cordeiro, M. Deb-
R. Chandra, and S. Kandula, “Enhancing network management using bah, T. Lestable, and N. S. Thandi, “Revolutionizing cyber threat
code generated by large language models,” in Proc. of the 22nd ACM detection with large language models: A privacy-preserving bert-based
Workshop on Hot Topics in Networks, 2023, pp. 196–204. lightweight model for iot/iiot devices,” IEEE Access, 2024.
[120] J. Zhang, J. Cambronero, S. Gulwani, V. Le, R. Piskac, G. Soares, and [140] Y. E. Seyyar, A. G. Yavuz, and H. M. Ünver, “An attack detection
G. Verbruggen, “Repairing bugs in Python assignments using large framework based on bert and deep learning,” IEEE Access, vol. 10,
language models,” arXiv:2209.14876, 2022. pp. 68 633–68 644, 2022.
[121] S. Thakur, B. Ahmad, Z. Fan, H. Pearce, B. Tan, R. Karri, B. Dolan- [141] S. Aftan and H. Shah, “Using the AraBERT model for customer
Gavitt, and S. Garg, “Benchmarking large language models for au- satisfaction classification of telecom sectors in saudi arabia,” Brain
tomated verilog RTL code generation,” in Proc. of 2023 Design, Sciences, vol. 13, no. 1, p. 147, 2023.
Automation & Test in Europe Conf. & Exhibition (DATE), 2023, pp. [142] S. Terra Vieira, R. Lopes Rosa, D. Zegarra Rodrı́guez, M. Ar-
1–6. jona Ramı́rez, M. Saadi, and L. Wuttisittikulkij, “Q-meter: Quality
[122] K. Dzeparoska, J. Lin, A. Tizghadam, and A. Leon-Garcia, “LLM- monitoring system for telecommunication services based on sentiment
based policy generation for intent-based management of applications,” analysis using deep learning,” Sensors, vol. 21, no. 5, p. 1880, 2021.

50
[143] Y. Yao, H. Zhou, and M. Erol-Kantarci, “Joint sensing and communica- [164] Z. Yang, L. Li, K. Lin, J. Wang, C.-C. Lin, Z. Liu, and L. Wang, “The
tions for deep reinforcement learning-based beam management in 6G,” dawn of LLM: Preliminary explorations with GPT-4v (ision),” arXiv
in Proc. IEEE 2022 GLOBECOM Conf., Dec 2022, pp. 5019–5024. preprint arXiv:2309.17421, vol. 9, no. 1, p. 1, 2023.
[144] S. Pratt, I. Covert, R. Liu, and A. Farhadi, “What does a platypus [165] T. Bujlow, V. Carela-Español, and P. Barlet-Ros, “Independent compar-
look like? generating customized prompts for zero-shot image classi- ison of popular dpi tools for traffic classification,” Computer Networks,
fication,” in Proc. of the IEEE/CVF Intl. Conf. on Computer Vision, vol. 76, pp. 75–89, 2015.
2023, pp. 15 691–15 701. [166] K. Lin, X. Xu, and H. Gao, “Tscrnn: A novel classification scheme
[145] Z. Shi, N. Luktarhan, Y. Song, and G. Tian, “BFCN: a novel classifica- of encrypted traffic based on flow spatiotemporal features for efficient
tion method of encrypted traffic based on BERT and CNN,” Electronics, management of iiot,” Computer Networks, vol. 190, p. 107974, 2021.
vol. 12, no. 3, p. 516, 2023. [167] P. Sirinam, M. Imani, M. Juarez, and M. Wright, “Deep fingerprinting:
[146] X. Lin, G. Xiong, G. Gou, Z. Li, J. Shi, and J. Yu, “Et-bert: A Undermining website fingerprinting defenses with deep learning,” in
contextualized datagram representation with pre-training transformers Proc. of the 2018 ACM SIGSAC Conf. on Computer and Communica-
for encrypted traffic classification,” in Proc. of 2022 ACM Web Conf., tions Security, 2018, pp. 1928–1943.
2022, pp. 633–642. [168] K. Shen and W. Yu, “Fractional programming for communication
[147] T. Van Ede, R. Bortolameotti, A. Continella, J. Ren, D. J. Dubois, systems—Part I: Power control and beamforming,” IEEE Trans. on
M. Lindorfer, D. Choffnes, M. Van Steen, and A. Peter, “Flowprint: Signal Processing, vol. 66, no. 10, pp. 2616–2630, 2018.
Semi-supervised mobile-app fingerprinting on encrypted network traf- [169] S. L. Martins and C. C. Ribeiro, “Metaheuristics and applications to
fic,” in Proc. of Network and distributed system security symposium optimization problems in telecommunications,” Handbook of optimiza-
(NDSS), vol. 27, 2020. tion in telecommunications, pp. 103–128, 2006.
[148] G. Draper-Gil, A. H. Lashkari, M. S. I. Mamun, and A. A. Ghorbani, [170] S. Alarie, C. Audet, A. E. Gheribi, M. Kokkolaras, and S. Le Digabel,
“Characterization of encrypted and vpn traffic using time-related,” in “Two decades of blackbox optimization applications,” EURO Journal
Proc. of the 2nd Intl. Conf. on information systems security and privacy on Computational Optimization, vol. 9, p. 100011, 2021.
(ICISSP), 2016, pp. 407–414. [171] R. Devidze, G. Radanovic, P. Kamalaruban, and A. Singla, “Explicable
[149] A. Radford, K. Narasimhan, T. Salimans, I. Sutskever reward design for reinforcement learning agents,” Advances in Neural
et al., “Improving language understanding by generative pre- Information Processing Systems, vol. 34, pp. 20 118–20 131, 2021.
training,” OpenAI, Tech. Rep., 2018. [Online]. Available: [172] R. Anand, D. Aggarwal, and V. Kumar, “A comparative analysis of
https://www.mikecaptain.com/resources/pdf/GPT-1.pdf optimization solvers,” Journal of Statistics and Management Systems,
[150] K. Mitra, A. Zaslavsky, and C. Åhlund, “Context-aware QoE mod- vol. 20, no. 4, pp. 623–635, 2017.
elling, measurement, and prediction in mobile computing systems,” [173] S. Bubeck, V. Chandrasekaran, R. Eldan, J. Gehrke, E. Horvitz, E. Ka-
IEEE Trans. on Mobile Computing, vol. 14, no. 5, pp. 920–936, 2013. mar, P. Lee, Y. T. Lee, Y. Li, S. Lundberg et al., “Sparks of artificial
[151] J.-B. Alayrac, J. Donahue, P. Luc, A. Miech, I. Barr, Y. Hasson, general intelligence: Early experiments with gpt-4,” arXiv:2303.12712,
K. Lenc, A. Mensch, K. Millican, M. Reynolds et al., “Flamingo: 2023.
a visual language model for few-shot learning,” Advances in Neural
[174] N. Shinn, F. Cassano, A. Gopinath, K. Narasimhan, and S. Yao,
Information Processing Systems, vol. 35, pp. 23 716–23 736, 2022.
“Reflexion: Language agents with verbal reinforcement learning,”
[152] Y. Tewel, Y. Shalev, I. Schwartz, and L. Wolf, “Zerocap: Zero-shot
Advances in Neural Information Processing Systems, vol. 36, 2024.
image-to-text generation for visual-semantic arithmetic,” in Proc. of the
[175] K. Cobbe, V. Kosaraju, M. Bavarian, M. Chen, H. Jun, L. Kaiser,
IEEE/CVF Conf. on Computer Vision and Pattern Recognition, 2022,
M. Plappert, J. Tworek, J. Hilton, R. Nakano et al., “Training verifiers
pp. 17 918–17 928.
to solve math word problems,” arXiv:2110.14168, 2021.
[153] Y. Du, F. Wei, Z. Zhang, M. Shi, Y. Gao, and G. Li, “Learning
to prompt for open-vocabulary object detection with vision-language [176] M. Suzgun, N. Scales, N. Schärli, S. Gehrmann, Y. Tay, H. W. Chung,
model,” in Proc. of the IEEE/CVF Conf. on Computer Vision and A. Chowdhery, Q. V. Le, E. H. Chi, D. Zhou et al., “Challeng-
Pattern Recognition, 2022, pp. 14 084–14 093. ing big-bench tasks and whether chain-of-thought can solve them,”
arXiv:2210.09261, 2022.
[154] “Spacy text analytic tool,” https://spacy.io/usage, accessed: 2010-09-30.
[155] M. A. Ferrag, O. Friha, D. Hamouda, L. Maglaras, and H. Janicke, [177] P.-F. Guo, Y.-H. Chen, Y.-D. Tsai, and S.-D. Lin, “Towards optimizing
“Edge-iiotset: A new comprehensive realistic cyber security dataset with large language models,” arXiv:2310.05204, 2023.
of iot and iiot applications: Centralized and federated learning,” 2022. [178] H. Chen, G. E. Constante-Flores, and C. Li, “Diagnosing infeasible op-
[Online]. Available: https://dx.doi.org/10.21227/mbc1-1h68 timization problems using large language models,” arXiv:2308.12923,
[156] T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cis- 2023.