
Deep Reinforcement Learning based Rate Adaptation for Wi-Fi Networks

Wenhai Lin∗, Ziyang Guo†, Peng Liu†, Mingjun Du∗, Xinghua Sun∗ and Xun Yang†

∗ School of Electronics and Communication Engineering, Sun Yat-sen University, Guangzhou, China
† Wireless Technology Lab, 2012 Laboratories, Huawei Technologies Co., Ltd, China

Email: [email protected], [email protected], [email protected]

The work of W. Lin was done during an internship at the Wireless Technology Lab, 2012 Laboratories, Huawei Technologies Co., Ltd. The work of X. Sun was supported in part by the National Key Research and Development Program of China (2019YFE0114000), and in part by the Guangdong Engineering Technology Research Center for Integrated Space-Terrestrial Wireless Optical Communication.

Abstract—The rate adaptation (RA) algorithm, which adaptively selects the rate according to the quality of the wireless environment, is one of the cornerstones of wireless systems. In Wi-Fi networks, dynamic wireless environments are mainly due to fading channels and to collisions caused by random access protocols. However, existing RA solutions mainly focus on adapting to fading channels, resulting in conservative RA policies and poor overall performance in highly congested networks. To address this problem, we propose a model-free deep reinforcement learning (DRL) based RA algorithm, named drlRA, which incorporates the impact of collisions into the reward function design. Numerical results show that the proposed algorithm improves throughput by 16.5% and 39.5% while reducing latency by 25% and 19.3% compared to state-of-the-art baselines.

Index Terms—Deep reinforcement learning, rate adaptation, Wi-Fi, CSMA/CA, MCS

I. INTRODUCTION

Wireless channel conditions are unstable due to the impact of path loss, noise, shadowing, fading, interference and radio-frequency chain impairments in wireless communication systems. To better utilize wireless resources, rate adaptation has become one of the mandatory functionalities in IEEE 802.11 wireless local area networks (WLANs): it adaptively selects modulation and coding schemes (MCS) based on the quality of the wireless channels. Each MCS is associated with a coding rate and a constellation size, and hence with a given bit rate. To increase single-link throughput, more antennas, wider channel bandwidths and higher-order modulation are adopted in current IEEE 802.11 networks (i.e., IEEE 802.11ax or Wi-Fi 6), and the number of available MCSs has increased significantly. Therefore, the rate adaptation algorithm is of great importance.

Conventional rate adaptation (RA) schemes for IEEE 802.11 networks are rule-based and can be roughly categorized as SNR-based and sampling-based [1]. In the SNR-based approach, the transmitter estimates the instantaneous signal-to-noise ratio (SNR) from the physical layer and translates it into a bit rate that can be supported by the current channel conditions. The estimated SNR has been shown to be an unreliable measure, since it can be easily affected by severe interference. As a result, most mainstream Wi-Fi vendors employ sampling-based RA algorithms, such as Minstrel HT [2], which is used in the ath9k Wi-Fi driver, and Iwl-Mvm-Rs in the Intel IwlWifi Linux driver [3]. Sampling-based RA schemes usually select the MCS whose historical behavior performs best. Their major drawback is that an MCS can only be evaluated once enough samples have been collected; in other words, each MCS has to be probed many times. Such mechanisms may not respond promptly to highly dynamic wireless environments.

To overcome the shortcomings of conventional RA algorithms, recent studies have embraced artificial intelligence (AI), exploiting its predictive capability to learn the intrinsic relationship between wireless environmental observations and the MCS selection scheme (in this paper, MCS selection and rate adaptation are used interchangeably). In [1], [4], supervised learning (SL) based RA algorithms were investigated and shown to achieve significant potential gains. Nonetheless, SL may generalize poorly to different wireless environments due to the lack of online learning [5].

With the rapid development of reinforcement learning (RL), RA schemes based on RL have been proposed. In [6]–[8], the RA problem was formulated as a multi-armed bandit (MAB) problem, where each candidate MCS is encoded as a discrete arm of the MAB, and Thompson sampling was utilized for its fast convergence. However, the scenarios of interest in these works are cellular communication systems operating on licensed spectrum, which solely consider channel fading due to mobility and multi-path effects. In IEEE 802.11 networks operating on unlicensed spectrum, the dynamics of the radio environment are also highly affected by interference from hidden nodes or other coexisting transmission technologies, as well as by collisions caused by the random channel access mechanism in the Medium Access Control (MAC) layer, i.e., carrier sense multiple access with collision avoidance (CSMA/CA).

In [5], a deep reinforcement learning (DRL) based RA scheme was proposed, and its performance was verified on a commodity 802.11ac prototype. That work mainly focused on the throughput of the nodes running the DRL-based RA.

As shown by the simulation results in Section V-C1, it can deteriorate the overall network throughput. This is because the policy of those nodes becomes conservative as the network becomes congested, so that most nodes tend to choose a low MCS, resulting in longer channel airtime and inefficient spectrum utilization.

In this work, we formulate the RA problem in CSMA/CA networks as a Markov Decision Process (MDP) and develop a DRL-based RA algorithm, named drlRA for conciseness. To avoid overly conservative policies, we design a reward function that does not penalize DRL actions for collision-induced failures, so that the agent does not fall back to inefficient MCSs in congested wireless environments; the details are elaborated in Section III-C. Extensive simulation results show that drlRA achieves throughput enhancements of up to 39.5% and 16.5%, and latency reductions of up to 19.3% and 25.8%, compared to Minstrel HT [2] and the state-of-the-art RA algorithm [5], respectively.

The remainder of this paper is organized as follows. Section II introduces the system model and problem description. Section III formulates the considered RA problem in the MDP framework. The drlRA algorithm is elaborated in Section IV. Simulation results are presented in Section V, followed by the conclusion in Section VI.

II. SYSTEM MODEL AND PROBLEM DESCRIPTION

We focus on downlink Wi-Fi networks where access points (APs) transmit packets (of B bits each) to their associated stations (STAs). Each AP contends for one shared channel through the CSMA/CA protocol. Before each packet transmission, a backoff counter (BOF) is selected randomly from {0, ..., W_i − 1}, where W_i is the contention window (CW) and i ∈ {0, ..., K}. The BOF is decreased by one whenever the channel is sensed idle for a time slot, and the AP transmits its packet once the BOF counts down to zero. The RA algorithm is responsible for selecting an MCS m ∈ M ≜ {1, 2, ..., M} to encode each packet. Correspondingly, the packet is transmitted at a rate of C_m, where C_1 < ... < C_M, and occupies D_m = B/C_m of air time. If the packet is successfully decoded by the target STA, the AP receives an acknowledgement (ACK) and resets the CW to W_0; otherwise, due to collisions or an erroneous MCS policy, the CW is doubled until it reaches W_K.

The wireless signal experiences both large-scale and small-scale fading. A log-distance path loss model and a Nakagami-m fast fading model are assumed, respectively, consistent with the widely used NS-3 system-level simulator [9].

The goal of the RA problem is to maximize the following objective function,

    \underset{m \in \mathcal{M}}{\text{maximize}} \quad (1 - PER_m) R_m,    (1)

where PER_m denotes the packet error rate (PER) caused by choosing MCS m, and R_m represents the average rate rather than the nominal rate C_m, because in CSMA/CA networks the throughput is affected not only by the packet transmission time D_m but also by the waiting time D_W due to backoff. The optimum MCS can be determined by computing

    m^* = \arg\max_{m \in \mathcal{M}} (1 - PER_m) R_m.    (2)

If both the channel distribution and the collision distribution were known, PER_m and D_W could be expressed explicitly and m^* could be calculated. However, it is complicated to characterize the distribution of D_W, which makes the RA problem challenging in CSMA/CA networks. In the following, we therefore leverage a model-free DRL-based solution.
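As a concrete illustration of (1) and (2), the following Python sketch (not from the paper) picks the MCS that maximizes the expected goodput, using the average-rate estimate R_m = B/(D_m + D_W) that the paper introduces later in (6). The per-MCS PER values and the mean waiting time passed to the function are placeholder inputs.

# Illustrative sketch: selecting m* of Eq. (2) when PER_m and the mean
# backoff waiting time D_W are assumed to be known. Rates follow Table II.
B = 27_000                      # packet size in bits (Table I)
RATES_MBPS = [8.6, 17.2, 25.8, 34.4, 51.6, 68.8, 77.4, 86.0, 103.2, 114.7]

def best_mcs(per, d_w_us):
    """Return (m*, goodput) with m* = argmax_m (1 - PER_m) * R_m.
    `per` is a hypothetical list of PER_m values, one per MCS;
    `d_w_us` is the assumed expected waiting time in microseconds."""
    best_m, best_goodput = None, -1.0
    for m, (rate, p) in enumerate(zip(RATES_MBPS, per), start=1):
        d_m_us = B / rate                 # airtime D_m = B / C_m (bits / Mbps = us)
        r_m = B / (d_m_us + d_w_us)       # average rate R_m of Eq. (6), in Mbps
        goodput = (1.0 - p) * r_m         # objective (1 - PER_m) * R_m of Eq. (1)
        if goodput > best_goodput:
            best_m, best_goodput = m, goodput
    return best_m, best_goodput

The 1-based MCS indexing matches Table II.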
III. MDP FORMULATION

A. MDP Basics

The MDP is a classical formalization of sequential decision making [10]. In an MDP, an agent in a state s_t ∈ S chooses an action a_t ∈ A. The environment responds with a new state s_{t+1} ∈ S and feeds back a reward r_t ∈ R. The goal of the MDP agent is to find the optimal policy π^*(s) that maximizes the expected cumulative discounted return E_\pi[\sum_{t=0}^{\infty} \gamma^t r_t], where γ is a discount factor.

To find the optimal policy, the action-value function, also known as the Q-value, is defined as the expected cumulative discounted return from undertaking action a at state s, i.e.,

    Q(s, a) \triangleq E_\pi\Big[\sum_{k=0}^{\infty} \gamma^k r_{t+k} \,\Big|\, s_t = s, a_t = a\Big].

According to the Bellman optimality equation, with Q^*(s, a) \triangleq \max_\pi Q_\pi(s, a),

    Q^*(s, a) = E\Big[r_t + \gamma \max_{a'} Q^*(s_{t+1}, a') \,\Big|\, s_t = s, a_t = a\Big].

The optimal policy π^* can be derived from the optimal action-value function by taking the greedy action, π^*(s) = \arg\max_a Q^*(s, a). Q^*(s, a) can be approximated by the Q-learning algorithm or the Deep Q-Network (DQN) algorithm [11].

The MDP model can be described by the 4-tuple <S, A, r, γ>, whose definitions with respect to the RA problem are elaborated hereafter.

B. Action and States

The action of the agent at transmission instance t is defined as a_t ∈ M.

The observation of the agent at t, o_t ∈ O, consists of information on the last transmission, which can be denoted as

    o_t \triangleq [a_{t-1}, ACK_{t-1}, RSS_{t-1}, d_{t-1}],    (3)

where a_{t-1} is the previous action, ACK_{t-1} ∈ {0, 1} indicates whether an ACK was received, and RSS_{t-1} is the received signal strength (RSS) measured from the ACK signal. d_{t-1} represents the number of time slots between the (t − 1)-th and the t-th transmission instance. The RSS is used to reflect channel conditions.

The state of the agent s_t ∈ S at t is the observation history of the agent, which is given by

    s_t \triangleq [o_{t-T+1}, \cdots, o_{t-2}, o_{t-1}],    (4)

where T is the length of the observation history.
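For concreteness, the observation of (3) and the flattened state of (4) could be maintained as in the following Python sketch. This is an illustration rather than the authors' code; the history length T = 10 follows the drlRA parameters listed later in Table III, and the flattening of the history into a single vector for the neural network input is an assumption.

from collections import deque

T = 10  # observation history length (Table III)

class StateBuilder:
    def __init__(self, history_len=T, obs_dim=4):
        # start from an all-zero history so s_t always contains T observations
        self.history = deque([[0.0] * obs_dim for _ in range(history_len)],
                             maxlen=history_len)

    def update(self, prev_action, ack, rss, slots_since_last_tx):
        """prev_action: a_{t-1} (MCS index), ack: ACK_{t-1} in {0, 1},
        rss: RSS_{t-1} measured from the ACK, slots_since_last_tx: d_{t-1}."""
        o_t = [float(prev_action), float(ack), float(rss), float(slots_since_last_tx)]
        self.history.append(o_t)          # the oldest observation is dropped
        # s_t of Eq. (4), flattened into one vector of length T * 4 for the NN
        return [x for obs in self.history for x in obs]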

C. Reward

The reward function is the core design element in a DRL algorithm. There are only two outcomes for each packet at MCS m, i.e., success or failure. With respect to MCS m, denote the PER as PER[m], the reward for success as r_succ[m], and the reward for failure as r_fail[m]. A direct design is to use the throughput as the reward function, as in other existing work. However, this design causes overly conservative policies in a highly congested wireless network, since nodes tend to choose a low-level MCS in the face of collisions.

To avoid this conservative policy, the agent should distinguish between two types of packet errors: packet errors caused by MCS selection and packet errors caused by collisions. This work uses PER[m] and PER_m to distinguish the two types. PER[m] contains the PER due to both a wrong MCS policy (PER_m) and collisions, and hence PER_m ≤ PER[m]. A natural way to design the reward function is to set the expectation of the reward equal to the objective function in (1), i.e., (1 − PER[m]) × r_succ[m] + PER[m] × r_fail[m] = (1 − PER_m) R_m. By fixing r_fail[m] = 0, r_succ[m] is derived as

    r_{succ}[m] = \frac{1 - PER_m}{1 - PER[m]} \times R_m, \quad \forall m \in \mathcal{M}.    (5)

(If r_fail[m] were instead set to a negative value, the agent could learn more from failures; we leave this for future work.)

In a highly congested wireless network, collisions become the main cause of packet reception failures. In this case, we have PER_m ≪ PER[m]. As a result, a policy that chooses a high-level MCS is encouraged according to (5).

In (5), R_m is estimated as

    R_m = \frac{B}{D_m + D_W},    (6)

where D_W is calculated from the expectation of the CW, i.e., D_W = 0.5 W_i. PER_m can be estimated by means of a look-up table that stores the relationship between the SNR and PER_m [12]. The SNR can be derived as the ratio of the average RSS statistics from ACKs to the energy level detected on the idle channel.
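The reward computation of (5) can be sketched in Python as follows. This is illustrative only: per_from_snr stands in for the SNR-to-PER_m look-up table of [12], and PER[m] is estimated here from a sliding window of the last 100 transmission outcomes per MCS (the window size listed later in Table III); how exactly the paper estimates PER[m] is not spelled out, so this windowed estimate is an assumption.

from collections import defaultdict, deque

WINDOW = 100  # sliding window size to calculate PER (Table III)

class RewardCalculator:
    def __init__(self, per_from_snr):
        # per_from_snr(snr_db, m) is a placeholder for the SNR-to-PER_m lookup of [12]
        self.per_from_snr = per_from_snr
        self.outcomes = defaultdict(lambda: deque(maxlen=WINDOW))  # per-MCS history, 1 = failure

    def reward(self, m, success, snr_db, r_m):
        """Reward for MCS m; r_m is the average rate R_m = B / (D_m + D_W) of Eq. (6)."""
        self.outcomes[m].append(0 if success else 1)
        if not success:
            return 0.0                                        # r_fail[m] is fixed to 0
        window = self.outcomes[m]
        per_all = sum(window) / len(window)                   # PER[m]: wrong-MCS errors plus collisions
        per_mcs = min(self.per_from_snr(snr_db, m), per_all)  # enforce PER_m <= PER[m] on noisy estimates
        return (1.0 - per_mcs) / (1.0 - per_all) * r_m        # Eq. (5); ratio >= 1, so collisions do not depress it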
IV. ALGORITHM

With the definitions of the MDP tuple with respect to the RA problem at hand, we can use DQN to solve it. The pseudo-code of drlRA is summarized in Algorithm 1.

The neural network (NN) architecture is illustrated in Fig. 1; it is a residual network containing seven fully-connected (FC) layers. The NN takes the current state s_t as input and outputs Q ≜ [Q(s_t, 1), ..., Q(s_t, M)]. Action a_t is selected using an ε-greedy policy, where the agent chooses actions greedily with probability 1 − ε and chooses actions randomly with probability ε to ensure convergence to an optimal policy.

Fig. 1: The architecture of the neural network.

Algorithm 1 drlRA algorithm
  Initialization: ε, γ, N, s_0, a_0, t = 0, cnt = 0, θ⁻ = θ
  for the transmission instance t = 1, 2, ... do
      Compute s_t from s_{t−1}, a_{t−1} using (3) and (4)
      Store (s_{t−1}, a_{t−1}, r_{t−1}, s_t) in the experience memory (EM)
      Input s_t to the NN in Fig. 1 with θ and output Q
      Generate action a_t from Q using the ε-greedy policy
      Calculate the reward r_t according to (5)
      for each sample e = (s, a, r, s′) in the EM do
          Compute y = r + γ max_{a′} Q(s′, a′; θ⁻)
          Compute L(θ) = (y − Q(s, a; θ))²
          Update θ by performing mini-batch gradient descent
      end for
      if (cnt mod N) == 0 then
          θ⁻ ← θ
      end if
      cnt ← cnt + 1
      t ← t + 1
  end for
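A possible realization of Algorithm 1 is sketched below in PyTorch. It is not the authors' implementation: Fig. 1 only states that the network has seven FC layers with a residual connection, so the hidden width (64), the ReLU activations, the placement of the skip connection, and the use of uniformly sampled mini-batches from the experience memory are assumptions; the hyperparameters follow Table III.

import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Seven FC layers with a residual connection, as suggested by Fig. 1."""
    def __init__(self, state_dim, num_mcs, hidden=64):
        super().__init__()
        self.inp = nn.Linear(state_dim, hidden)
        self.mid = nn.ModuleList([nn.Linear(hidden, hidden) for _ in range(5)])
        self.out = nn.Linear(hidden, num_mcs)      # outputs Q(s_t, 1), ..., Q(s_t, M)

    def forward(self, s):
        h = torch.relu(self.inp(s))
        skip = h                                   # skip-connection placement is an assumption
        for layer in self.mid:
            h = torch.relu(layer(h))
        return self.out(h + skip)

class DrlRaAgent:
    def __init__(self, state_dim, num_mcs, gamma=0.9, lr=5e-4,
                 batch_size=32, memory_size=500, target_every=100):
        self.q = QNetwork(state_dim, num_mcs)
        self.q_target = QNetwork(state_dim, num_mcs)
        self.q_target.load_state_dict(self.q.state_dict())      # theta^- = theta
        self.opt = torch.optim.Adam(self.q.parameters(), lr=lr)
        self.memory, self.memory_size = [], memory_size
        self.gamma, self.batch_size, self.target_every = gamma, batch_size, target_every
        self.eps, self.eps_min, self.eps_decay = 0.5, 0.01, 0.995   # Table III
        self.num_mcs, self.cnt = num_mcs, 0

    def act(self, state):
        # epsilon-greedy selection over the M candidate MCSs
        if random.random() < self.eps:
            return random.randrange(self.num_mcs)
        with torch.no_grad():
            q = self.q(torch.tensor(state, dtype=torch.float32))
        return int(q.argmax().item())

    def step(self, transition):
        # transition = (s_{t-1}, a_{t-1}, r_{t-1}, s_t), as stored in Algorithm 1
        self.memory.append(transition)
        if len(self.memory) > self.memory_size:
            self.memory.pop(0)
        if len(self.memory) >= self.batch_size:
            batch = random.sample(self.memory, self.batch_size)
            s, a, r, s2 = (torch.tensor(x, dtype=torch.float32) for x in zip(*batch))
            a = a.long()
            with torch.no_grad():
                y = r + self.gamma * self.q_target(s2).max(dim=1).values   # TD target with theta^-
            q_sa = self.q(s).gather(1, a.unsqueeze(1)).squeeze(1)
            loss = nn.functional.mse_loss(q_sa, y)                         # (y - Q(s, a; theta))^2
            self.opt.zero_grad()
            loss.backward()
            self.opt.step()
        self.cnt += 1
        if self.cnt % self.target_every == 0:
            self.q_target.load_state_dict(self.q.state_dict())             # theta^- <- theta
        self.eps = max(self.eps_min, self.eps * self.eps_decay)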


V. PERFORMANCE EVALUATION

To evaluate the effectiveness of the proposed drlRA algorithm, simulation results under a grid topology and random topologies are presented in this section. The grid topology is pictorially illustrated in Fig. 2.

Fig. 2: Grid topology: four basic service sets (BSSs) are placed in a grid, where the stars represent APs equipped with drlRA, the circles represent APs with a fixed MCS m = 6, and the squares represent STAs.

A. Simulation Setup

Simulation parameters are summarized in Table I, where D is the distance between the transmitter and the receiver. The rates of the available MCSs ({C_m}) are listed in Table II. We introduce the following algorithms as baselines: the commonly used Minstrel HT [2] and the experience-driven rate adaptation (EDRA) [5]; they are compared with the newly proposed DRL-based RA algorithm, drlRA, described above. The parameters of drlRA are shown in Table III. The parameters of Minstrel are the exponentially weighted moving average (EWMA) weight, the sampling window and the proportion of probing, which are set to 0.75, 100 ms and 10%, respectively. EDRA alternates between two periods, a probing period and a transmission period; in our simulation, the probing period and the transmission period last for the transmission of 5 and 18 packets, respectively.

TABLE I: Simulation Parameters
  Time slot: 9 μs
  Size of each packet (B): 27000 bits
  Simulation time: 10 s
  CSMA/CA (W_0, W_K): (32, 1024)
  Path loss model: −46.67 − 30 log10(D)
  Transmit power: 10 dBm
  Traffic type: saturated Poisson traffic

TABLE II: Available MCSs and Rates
  MCS:          1     2     3     4     5     6     7     8     9     10
  Rate (Mbps):  8.6   17.2  25.8  34.4  51.6  68.8  77.4  86.0  103.2 114.7

TABLE III: Parameters of drlRA
  Agent action-observation history length T: 10
  Experience memory size: 500
  Discount factor γ: 0.9
  Batch size b_s: 32
  Learning rate: 5 × 10⁻⁴
  Replace target iteration N: 100
  Range of ε: 0.5 to 0.01
  Decay rate of ε: 0.995
  Sliding window size to calculate PER: 100

B. Performance Metrics

The following metrics are used to evaluate the performance of the algorithms of interest.
  • Total Throughput: the ratio of the total number of successfully transmitted bits in the network to the evaluation interval, which is set to the last five seconds of the simulation.
  • Mean Delay: the average latency of successfully transmitted packets during the last five seconds of the simulation.
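For clarity, the two metrics could be computed from a per-packet trace as in the following sketch. The log format (tx_time_s, rx_time_s, bits, success) is hypothetical and not part of the paper.

SIM_TIME, EVAL_WINDOW = 10.0, 5.0   # 10 s simulation, last 5 s evaluated

def total_throughput_mbps(packet_log):
    start = SIM_TIME - EVAL_WINDOW
    bits = sum(p[2] for p in packet_log if p[3] and p[1] >= start)
    return bits / EVAL_WINDOW / 1e6          # successfully delivered bits per second, in Mbps

def mean_delay_ms(packet_log):
    start = SIM_TIME - EVAL_WINDOW
    delays = [p[1] - p[0] for p in packet_log if p[3] and p[1] >= start]
    return 1e3 * sum(delays) / len(delays) if delays else float("nan")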
C. Simulation Results

1) Topo A: Topo A is a simple grid topology, where different settings (A1–A4) are deployed to examine the coexistence performance with fixed-MCS setups.

As shown in Fig. 3, EDRA performs the worst in terms of throughput. The reason is that when EDRA encounters a high PER due to interference or collisions, it reduces the rate to achieve a lower PER. However, such a conservative rate selection policy causes the channel to be occupied by low-rate transmissions for a long time, which decreases the network throughput. This deterioration becomes more pronounced as the number of EDRA nodes increases.

The proposed drlRA algorithm outperforms EDRA and Minstrel by up to 39.5% and 16.5%, respectively, in terms of throughput, and reduces the delay by up to 19.2% and 25.9%. This enhancement comes primarily from the reward design in (5), which distinguishes the PER due to a wrong MCS policy from that due to collisions. A large gap between PER[m] and PER_m indicates that the main cause of packet errors is interference. Therefore, drlRA maintains a high rate to avoid mutual interference. If all transmitters choose a high rate with short transmission airtime, collisions are alleviated. It can be seen that the network maintains high throughput as the number of drlRA nodes increases. As a result, drlRA avoids overly conservative policies.

Fig. 3: Performance comparison under Topo A: (a) throughput, (b) delay. For Ax, x indicates the number of APs equipped with the RA algorithms of interest; the other APs all use a fixed MCS.

2) Topo B: In this case, we evaluate the performance of the proposed drlRA algorithm under random topologies with more BSSs. APs and STAs are randomly dropped within a 60 m × 60 m square, with the coordinates of APs and STAs following a Poisson point distribution. Each topology with a different number of APs is generated ten times in total. As shown in Fig. 4, both DRL-based schemes (EDRA and drlRA) outperform the conventional rule-based Minstrel algorithm. Although there is no significant performance gain over EDRA in terms of average latency, drlRA delivers significant throughput gains of approximately 23.6% to 47% in the different settings.
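A random drop of the kind used in Topo B could be generated as in the sketch below. This is an illustration only: since the number of BSSs is fixed in each run, the Poisson point distribution is approximated here by independent uniform drops, and the number of STAs per BSS as well as the BSS counts in the example are placeholders rather than values taken from the paper.

import numpy as np

AREA = 60.0  # side length of the square deployment area, in metres

def drop_topology(num_bss, stas_per_bss=1, rng=None):
    rng = rng or np.random.default_rng()
    aps = rng.uniform(0.0, AREA, size=(num_bss, 2))                  # AP coordinates
    stas = rng.uniform(0.0, AREA, size=(num_bss * stas_per_bss, 2))  # STA coordinates
    return aps, stas

# Example: ten independent drops per BSS count, as in Section V-C2 (counts are placeholders).
topologies = {n: [drop_topology(n) for _ in range(10)] for n in (4, 6, 8)}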
D. Convergence and Complexity

The convergence behavior of the proposed drlRA can be observed from Fig. 5, where the convergence time is less than 1 second.

Fig. 4: Performance comparison under Topo B: (a) throughput, (b) delay. For Bx, x indicates the number of BSSs.

Fig. 5: Convergence performance of the proposed drlRA algorithm in setting A3 of Topo A.

For the NN architecture adopted in our work, i.e., the one shown in Fig. 1, counting all floating-point operations in inference gives a computational complexity of 47880 floating-point operations (FLOPs). The seven-layer NN may not be necessary, and there is still room to reduce the FLOPs by optimizing the NN architecture.
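An inference FLOP count for a stack of FC layers can be estimated with a short helper like the one below. The layer widths are placeholders, since the paper does not report them; only the input dimension T × 4 = 40 (Section III-B with T = 10) and the output dimension M = 10 (Table II) follow from the text, and the counting convention (multiply plus add per weight, plus the bias adds) is an assumption, so the result is not claimed to reproduce the 47880 figure exactly.

def fc_flops(layer_widths):
    """layer_widths: [in_dim, h1, ..., out_dim]. Each FC layer is counted as
    roughly 2 * fan_in * fan_out FLOPs plus fan_out for the bias additions."""
    total = 0
    for fan_in, fan_out in zip(layer_widths[:-1], layer_widths[1:]):
        total += 2 * fan_in * fan_out + fan_out
    return total

print(fc_flops([40, 64, 64, 64, 64, 64, 64, 10]))  # placeholder hidden widths, not the paper's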
VI. CONCLUSION AND FUTURE WORK

In this work, we propose a new DRL-based RA algorithm, named drlRA, for Wi-Fi networks. Unlike existing DRL-based solutions, drlRA takes into consideration the impact of collisions caused by random channel access on the MCS selection policy. In particular, in the reward function, drlRA incorporates the waiting time into the average throughput expression and distinguishes packet errors caused by collisions from those caused by erroneous MCS decisions. We compare the performance of drlRA with the classical RA algorithm Minstrel and the state-of-the-art DRL-based algorithm EDRA. Experimental results show that drlRA achieves higher overall throughput and lower latency.

Moving forward, we are interested in designing reward functions with negative values in the case of failure, which would help the agent learn from its failures. Another direction to further improve performance is to jointly design intelligent RA and intelligent channel access algorithms, such as [13].

REFERENCES

[1] S. Khastoo, T. Brecht, and A. Abedi, "NeuRA: Using neural networks to improve WiFi rate adaptation," in Proceedings of the 23rd International ACM Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems. New York, NY, USA: Association for Computing Machinery, 2020, pp. 161–170.
[2] R. Albar, T. Y. Arif, and R. Munadi, "Modified rate control for collision-aware in Minstrel-HT rate adaptation algorithm," in 2018 International Conference on Electrical Engineering and Informatics (ICELTICs), 2018, pp. 7–12.
[3] R. Grünblatt, I. Guérin-Lassous, and O. Simonin, "Simulation and performance evaluation of the Intel rate adaptation algorithm," in Proceedings of the 22nd International ACM Conference on Modeling, Analysis and Simulation of Wireless and Mobile Systems, ser. MSWiM '19. New York, NY, USA: Association for Computing Machinery, 2019, pp. 27–34. [Online]. Available: https://doi.org/10.1145/3345768.3355921
[4] C.-Y. Li, S.-C. Chen, C.-T. Kuo, and C.-H. Chiu, "Practical machine learning-based rate adaptation solution for Wi-Fi NICs: IEEE 802.11ac as a case study," IEEE Transactions on Vehicular Technology, vol. 69, no. 9, pp. 10264–10277, 2020.
[5] S.-C. Chen, C.-Y. Li, and C.-H. Chiu, "An experience driven design for IEEE 802.11ac rate adaptation based on reinforcement learning," in IEEE INFOCOM 2021 - IEEE Conference on Computer Communications, 2021, pp. 1–10.
[6] V. Saxena, H. Tullberg, and J. Jalden, "Reinforcement learning for efficient and tuning-free link adaptation," IEEE Transactions on Wireless Communications, vol. 21, no. 2, pp. 768–780, 2022.
[7] J. Park and S. Baek, "Two-stage Thompson sampling for outer-loop link adaptation," IEEE Wireless Communications Letters, vol. 10, no. 9, pp. 2004–2008, 2021.
[8] V. Saxena, H. Tullberg, and J. Jalden, "Model-based adaptive modulation and coding with latent Thompson sampling," in 2021 IEEE 32nd Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), 2021, pp. 610–616.
[9] G. F. Riley and T. R. Henderson, The ns-3 Network Simulator, 2010.
[10] R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction. MIT Press, 2018.
[11] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., "Human-level control through deep reinforcement learning," Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[12] P. H. Tan, Y. Wu, and S. Sun, "Link adaptation based on adaptive modulation and coding for multiple-antenna OFDM system," IEEE Journal on Selected Areas in Communications, vol. 26, no. 8, pp. 1599–1606, 2008.
[13] Z. Guo, Z. Chen, P. Liu, J. Luo, X. Yang, and X. Sun, "Multi-agent reinforcement learning-based distributed channel access for next generation wireless networks," IEEE Journal on Selected Areas in Communications, vol. 40, no. 5, pp. 1587–1599, 2022.

