Building Energy Efficient Semantic Segmentation in Intelligent Edge Computing
This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/TGCN.2023.3321113
Abstract—Semantic segmentation is a critical area in computer vision, which needs voluminous image data streaming from user devices. Usually, it is challenging to process semantic segmentation tasks in user devices due to the limited computation power and battery life. Intelligent edge computing effectively enhances the accuracy of semantic segmentation tasks by offloading computations to nearby devices, providing lower latency and improved responsiveness. However, inefficient offloading brings additional energy consumption due to the irregular relationship between task requirements and offloading settings. In this paper, we attempt to improve energy efficiency for processing semantic segmentation tasks in the edge environment by leveraging energy consumption and task requirements. We first investigate the power consumption with different offloading settings in a real intelligent edge environment. Based on the investigation, we formulate the offloading setting as a restricted multi-armed bandit problem and solve it by enhancing the upper confidence bound algorithm. Comprehensive simulation results show that the proposed solution significantly improves the energy efficiency for offloading semantic segmentation tasks in a given intelligent edge environment.

Index Terms—Intelligent Edge Computing, Semantic Segmentation, Multi-Armed Bandit (MAB), Energy Efficiency

X. Yuan, H. Li, K. Ota and M. Dong are with the Department of Sciences and Informatics, Muroran Institute of Technology, Muroran, 050-0071, Japan.

I. INTRODUCTION

Semantic segmentation, an essential branch in the field of computer vision, groups pixels according to different semantic meanings in images. Many important fields such as video surveillance [1], medical image analysis [2], and plant disease detection [3] require semantic segmentation models to process a large number of different data streams. Usually, most segmentation applications are deployed in high-end cloud servers rather than local devices due to the enormous computations required for processing high-accuracy models [4].

Intelligent edge computing deploys computing servers closer to the data source, i.e., at the edge of the network, supporting real-time semantic segmentation in local devices and mitigating the latency associated with cloud-based processing [5]–[9]. This advanced form of edge computing uses artificial intelligence and machine learning algorithms for analyzing data at the edge layer, enabling smarter decision-making and more efficient resource allocation [10]. However, unlike other AI tasks deployed in edge environments, semantic segmentation tasks require more computational resources and complex model deployments. This leads to increased energy consumption for edge devices handling such tasks. Furthermore, the energy supply for edge devices is always limited, making energy efficiency a crucial subject. Therefore, reducing energy consumption to ensure prolonged and stable operation of edge devices is an essential aspect of deploying semantic segmentation technologies, regardless of the specific environment or application [11].

Research relevant to the field of edge computing and semantic segmentation emphasizes optimizing task offloading locations to address energy efficiency challenges in edge computing [12], [13]. For semantic segmentation, optimizing offloading parameters offers a significant opportunity to improve efficiency. Semantic segmentation performance can be substantially affected by various parameters, originating from both the user, e.g., image resolution, and the server, e.g., neural network model complexity [14]. Specifically, accuracy requirements from the user perspective and power consumption constraints from the server side are intricately interconnected. For instance, processing low-resolution or compressed frames reduces power consumption but simultaneously compromises the semantic accuracy of the task [15]–[17]. Therefore, assigning suitable parameters to tasks before offloading can yield considerable energy savings.

We conduct preliminary experiments in an edge computing testbed to identify the offloading parameters that influence task accuracy and power consumption. With these crucial parameters identified, we alter them to observe the resulting changes in power consumption and task accuracy. Appropriate assignment of parameters, such as video format, neural network size, and expected accuracy, can significantly enhance energy efficiency, as evidenced by the experimental results in Section III. Therefore, after the edge computing system decides how to assign the appropriate parameters to tasks, the system can run with lower power consumption [18].

However, it is challenging to assign appropriate parameters automatically for offloading semantic segmentation tasks in edge computing due to the task complexity. Unlike assigning general computing tasks in edge servers, semantic segmentation tasks have multiple different configurable parameters with some unknown effects between them [19].

As a result, in this paper, we propose a learning strategy to improve the energy efficiency for offloading semantic segmentation tasks in edge computing. We first formulate the learning problem, finding the energy consumption with differ-
0000–0000/00$00.00 © 2021 IEEE
Authorized licensed use limited to: BIRLA INSTITUTE OF TECHNOLOGY AND SCIENCE. Downloaded on December 27,2023 at 05:11:45 UTC from IEEE Xplore. Restrictions apply.
© 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.See https://www.ieee.org/publications/rights/index.html for more information.
This article has been accepted for publication in IEEE Transactions on Green Communications and Networking. This is the author's version which has not been fully edited and
content may change prior to final publication. Citation information: DOI 10.1109/TGCN.2023.3321113
ent parameters for offloading semantic segmentation tasks in intelligent edge computing as a multi-armed bandit (MAB) problem. We then propose the UCB-CoR algorithm to solve the formulated MAB problem. Here, CoR stands for "Clustering of Regions", which refers to the pre-partitioning of parameters to reduce the iterations in the exploration of the UCB algorithm [20]. The motivation for adopting the UCB algorithm in our work is its well-known theoretical guarantees and proven performance in a wide range of applications. The UCB algorithm has been shown to achieve near-optimal regret bounds in the asymptotic setting, making it an attractive choice for scenarios where the number of arms is large and the horizon is long. Additionally, the UCB algorithm has a relatively simple implementation, which makes it computationally efficient and easy to use in practice. We will demonstrate this in Section VI. The time complexity of our algorithm is O(|Z|T), where |Z| is the number of arms (i.e., actions) and T is the total number of rounds. The main contributions of this paper are as follows.

1) We first conduct a preliminary experiment to assess the energy consumption associated with various parameters when offloading semantic segmentation tasks. The preliminary results reveal that appropriate parameter assignment substantially enhances the energy efficiency of offloading semantic segmentation tasks.
2) We then formulate a MAB problem to learn optimal parameters for offloading semantic segmentation tasks in an intelligent edge environment. To address the MAB problem, we propose the UCB-CoR algorithm, which effectively identifies the most suitable parameters while considering different constraints.
3) Lastly, we evaluate the feasibility and robustness of the proposed algorithm through comprehensive simulations, examining its performance under the influence of multiple parameters. The experimental results demonstrate that the UCB-CoR algorithm outperforms other algorithms, indicating its advantage in this context.

The rest of this paper is organized as follows. We describe the related work on semantic segmentation and intelligent edge computing in Section II. Section III presents our pre-experimental results for offloading semantic segmentation tasks in an intelligent edge computing testbed. The MAB problem formulation is described in Section IV. Section V introduces the UCB-CoR algorithm. Performance evaluation of the proposed algorithm is discussed in Section VI. Finally, we conclude this paper in Section VII.

II. RELATED WORK

Edge computing has been a significant research area for more than 15 years. In recent years, however, the focus on energy-efficient semantic segmentation, particularly in the context of edge computing, has increased. The combined emphasis on these areas has led to advancements in improving the performance of edge devices while also reducing energy consumption. In this section, we review several related works that explore these aspects.

Energy efficiency for edge computing: Several works have studied energy efficiency in edge computing, proposing techniques to minimize energy consumption while ensuring the quality of service. Optimizing task offloading and resource allocation in edge computing systems is one approach to improving energy efficiency. Raza et al. [21] propose a scheme for Vehicular Edge Computing that optimizes task offloading and resource allocation, considering latency and energy consumption while maximizing computational efficiency. Another research direction emphasizes creating energy-efficient algorithms and protocols for edge computing. Battiloro et al. [22] suggest a dynamic optimization framework for adaptive federated learning, concentrating on the integration of communication, computation, and learning aspects. This method optimizes resources, enables continuous learning, and adapts to various performance metrics, ultimately achieving energy-efficient, low-latency, and adaptive learning at the wireless network edge. Energy harvesting has also been explored as a potential solution for energy efficiency in edge computing. Zhang et al. [23] present a dynamic task assignment algorithm for Mobile Edge Computing (MEC) systems with energy harvesting, optimizing energy consumption and execution delay while ensuring battery stability and improved computational performance. Li et al. [24] propose a multi-relay assisted computation offloading framework for mobile edge computing energy harvest systems, which uses multiple neighboring nodes as relay nodes to improve task offloading in poor channel conditions. This method addresses the issues of increased execution time and task failure rates in such systems.

Efficient semantic segmentation: Semantic segmentation is an essential computer vision task that requires substantial computational resources. Researchers have proposed energy-efficient techniques to optimize the performance of semantic segmentation models. Wang et al. [25] proposed a framework that addresses efficiency problems in traditional video transmission systems by exploiting temporal correlations among video frames and utilizing context-based encoding. This approach enhances video transmission performance. Yang et al. [26] proposed a pertinent alternating algorithm for wireless resource allocation and semantic information extraction, aiming to achieve energy-efficient semantic communications in wireless networks with rate splitting. Their method seeks to minimize the network's total communication and computation energy consumption, taking into account computation, latency, and transmit power constraints. This leads to more energy-efficient communication overall. Furthermore, Kang et al. [27] developed an energy-efficient, task-oriented semantic communication framework that addresses challenges associated with energy-intensive and efficiency-limited image retrieval and semantic encoding without taking user personality into account. This framework significantly improves the personalization, anti-interference performance, and communication quality of semantic communication services, making it well-suited for various applications.

Semantic segmentation on edge computing: Ahamad et al. [28] propose a quantization algorithm for semantic segmentation deep learning architectures, which reduces the parameter size by four to eight times while maintaining minimal impact on accuracy. This solution enables the efficient implementation of these architectures on low-power, memory-
Fig. 1. Relationship between GPU utilization, power consumption and number of running networks. Panels: (a) GPU Utilization and Power Consumption; (b) Number of Network and GPU Utilization; (c) Power Consumption of NX and AGX. Devices shown: Nano, Xavier NX, Xavier AGX.
Fig. 2. Mean Intersection over Union, power consumption, and global accuracy for various image resolutions and neural network depths. Values read from the three panels (rows: video resolution; columns within each panel: ResNet-18, ResNet-34, ResNet-50, following the discussion of Fig. 2 in the text):

Video         (a) mIoU               (b) Power (W)          (c) Global accuracy
resolution    R18    R34    R50      R18    R34    R50      R18    R34    R50
512           50     55.8   51.5     4      6      20.5     87.2   88.7   87.4
480           49.5   55.4   49.9     3.9    5.7    18.3     87.1   88.7   87.4
320           51.6   53.7   56.5     2.9    3.7    13       87.1   87.5   88.8
256           52.1   54     55.8     2.6    3.1    8.8      86.1   87.1   88.6
128           36.8   37.5   42.7     2.3    2.5    3.7      77.5   76.6   81.4
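As a rough illustration of how the Fig. 2 measurements could be used, the Python sketch below enumerates the (resolution, depth) pairs and returns the lowest-power one whose mIoU exceeds a given threshold. The mIoU and power numbers are transcribed from Fig. 2a and 2b; the mapping of the three columns to ResNet-18/34/50 is an assumption based on the discussion of Fig. 2 in the text, and the function name `cheapest_config` is hypothetical. This brute-force lookup only works when every measurement is known in advance, so it is a baseline for intuition rather than the learning approach studied in this paper.

```python
# Measurements transcribed from Fig. 2.
# Assumption: the three columns correspond to ResNet-18, ResNet-34, ResNet-50.
RESOLUTIONS = [512, 480, 320, 256, 128]
DEPTHS = [18, 34, 50]

MIOU = {
    512: [50.0, 55.8, 51.5],
    480: [49.5, 55.4, 49.9],
    320: [51.6, 53.7, 56.5],
    256: [52.1, 54.0, 55.8],
    128: [36.8, 37.5, 42.7],
}

POWER_W = {
    512: [4.0, 6.0, 20.5],
    480: [3.9, 5.7, 18.3],
    320: [2.9, 3.7, 13.0],
    256: [2.6, 3.1, 8.8],
    128: [2.3, 2.5, 3.7],
}


def cheapest_config(min_miou):
    """Return (resolution, depth, power_w) with the lowest power among the
    configurations whose mIoU is strictly above min_miou, or None if no
    configuration qualifies."""
    best = None
    for res in RESOLUTIONS:
        for col, depth in enumerate(DEPTHS):
            if MIOU[res][col] > min_miou:
                candidate = (res, depth, POWER_W[res][col])
                if best is None or candidate[2] < best[2]:
                    best = candidate
    return best


if __name__ == "__main__":
    for threshold in (50, 55, 60):
        print(threshold, cheapest_config(threshold))
```

Note that the cheapest feasible configuration is not simply the smallest network or the lowest resolution, which reflects the non-monotonic behavior visible in Fig. 2a.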
Fig. 3. The mIoU values of videos with different resolutions after being compressed. Each of the three panels plots Mean Intersection over Union against Image Resolution (128, 256, 320, 480, 512) for compression rates 20 and 100.
consumption of the AGX was 1.3x that of the NX during a single task operation. Interestingly, this disparity diminished as the number of tasks increased, reducing to a 1.1x difference during a four-task operation. When considered with Fig. 1b, these findings suggested significantly lower GPU utilization for the Jetson Xavier AGX in comparison to the Jetson Xavier NX when managing four tasks. Consequently, we consider that the Jetson Xavier AGX was a more suitable option for our experimental context.

Upon selecting the appropriate edge servers for our experiment, we sought to identify the configurable parameters influencing the server's power consumption. To this end, we deployed various neural networks, specifically ResNet-18, ResNet-34, and ResNet-50 [30], which are commonly used for semantic segmentation tasks, on the Jetson AGX Xavier. Additionally, we tested the neural networks at five video resolutions, namely 128, 256, 320, 480, and 512. Fig. 2b shows the power consumption for each resolution and neural network depth combination. We can see that increasing image resolution and neural network depth increases the edge server's power consumption. When employing ResNet-50 to process images at a resolution of 512x512, the power consumption peaks at 20 watts. Notably, the power consumption difference between ResNet-18 and ResNet-34 is minimal, whereas the consumption for ResNet-50 is substantially higher than both.

In semantic segmentation tasks, two important evaluation metrics are often used: the mean Intersection over Union (mIoU) and global accuracy. These metrics could potentially factor into user requirements, making it vital to optimize power consumption without compromising them. The mIoU, defined as the ratio of the intersection to the union of two sets, one consisting of actual values and the other comprising predicted values, is an essential criterion for assessing the quality of image segmentation. Specifically,

\[ \mathrm{IoU} = \frac{TP}{FP + FN + TP}, \tag{1} \]

where $TP$ (True Positive) represents the intersection of the true and predicted values, $FP$ (False Positive) represents the instances where the model incorrectly predicted the positive class, and $FN$ (False Negative) represents the instances where the model incorrectly predicted the negative class. Thus, $FP + FN + TP$ represents the union of true and predicted values. Consequently, the mean Intersection over Union (mIoU) can be expressed as

\[ \mathrm{mIoU} = \frac{1}{k+1} \sum_{i=0}^{k} \frac{TP}{FN + FP + TP}. \tag{2} \]

Fig. 2a and 2c present the mIoU and global accuracy for various resolutions and neural network depths. Unlike the power consumption in Fig. 2b, neither mIoU nor global accuracy consistently increases with video resolution or neural network depth. The results reveal that mIoU may decrease as resolution increases while the neural network depth stays the same. The maximum mIoU and global accuracy are observed for ResNet-50 with a resolution of 320. This indicates that the changes in mIoU and global accuracy are not linearly related to either the video resolution or the depth of neural networks. This non-monotonic relationship underscores the complexity of the trade-off. Optimizing semantic segmentation performance in edge computing environments while maintaining energy efficiency is, therefore, a complex task.

In addition, we used videos with varying levels of compression as a test parameter to evaluate the performance of various semantic segmentation models. A compression rate of 100 represents the original, uncompressed video, while a value of 20 denotes a video compressed to 20% of the original. Fig. 3a-3c show the mIoU values of videos with different resolutions after being compressed. In most cases, videos with a compression rate of 100 yield higher mIoU than those with a compression rate of 20. The only exception is when tested on ResNet-34 with a resolution of 320, where the mIoU values for both compression rates are equal. The difference in mIoU between compression rates of 100 and 20 is smallest at a resolution of 128 and largest at a resolution of 512. These findings underscore the impact of video compression rate on the performance of semantic segmentation tasks.

Our preliminary experimental findings reveal that the video resolution, depth of neural networks and video compression rate exhibit a complex influence on the energy consumption of edge servers and the semantic segmentation accuracy. The relationships between these key parameters exhibit unpredictable correlations. Therefore, determining the optimal parameter configuration before the task is processed becomes challenging.
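To make the two metrics concrete, the following minimal Python sketch computes per-class IoU as in Eq. (1), averages it over classes as in Eq. (2), and also computes global accuracy, taken here in its usual sense of correctly classified pixels divided by all pixels. The confusion-matrix layout and the function names are assumptions made for illustration. Classes that never appear in either the ground truth or the prediction are skipped when averaging, which is a design choice of this sketch and not something stated in the text.

```python
def iou_per_class(confusion):
    """Per-class IoU = TP / (FP + FN + TP).

    confusion[i][j] is the number of pixels whose true class is i and
    whose predicted class is j (assumed layout).
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                       # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, true otherwise
        union = tp + fp + fn
        # None marks a class absent from both ground truth and prediction.
        ious.append(tp / union if union else None)
    return ious


def mean_iou(confusion):
    """Average the per-class IoU over the classes that actually occur."""
    valid = [v for v in iou_per_class(confusion) if v is not None]
    return sum(valid) / len(valid)


def global_accuracy(confusion):
    """Fraction of all pixels that were assigned the correct class."""
    correct = sum(confusion[c][c] for c in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


if __name__ == "__main__":
    toy = [[3, 1],
           [1, 5]]  # hypothetical two-class confusion matrix
    print(iou_per_class(toy), mean_iou(toy), global_accuracy(toy))
```

The toy matrix shows why the two metrics can disagree: global accuracy is dominated by the larger class, whereas mIoU weights every class equally.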
IV. PROBLEM FORMULATION

Fig. 4. Intelligent edge computing framework for semantic segmentation. Diagram labels: Capture Image, Set Requirement, Display Results, Parameter Assignment, Semantic Segmentation, Collect Results; User Devices, 802.11ac Access Point, Edge Servers.

In this section, we present the details of our system and outline the decision variables, system power consumption, and MAB problem models. For ease of reference, we summarize our commonly used notation in Table II.

TABLE II
SUMMARY OF IMPORTANT NOTATIONS

Notation             Meaning
$T$                  total number of rounds
$N$                  a set of tasks
$Z$                  a set of parameters
$P$                  total energy consumption
$F$                  a set of video resolutions
$H$                  a set of neural network depths
$K$                  a set of compression rates
$z_t$                parameter selection at round t
$l_t$                task transmission time at round t
$d_t$                task size at round t
$R^{tran}$           uploading transmission rate
$S^{tran}$           average signal power
$o^{tran}$           transmission power
$p^{tran}_{total}$   tasks total transmission consumption
$N^{tran}$           power of the noise
$p^{com}_{total}$    tasks total computing consumption
$\lambda_n$          minimum acceptable mIoU value from task n

A. System Overview

We propose an energy-efficient semantic segmentation framework leveraging intelligent edge computing, as depicted in Fig. 4. In our system, user devices (a user being a person or system interacting with the edge devices or edge servers) capture video frames (i.e., tasks) and transmit them to an edge server that performs semantic segmentation using Residual Networks. We index the tasks by $n = 1, 2, \ldots, N$. We assume that the user selects the minimum acceptable mIoU, denoted as $\lambda_n$, from a set of reference values that we provide. Each of these reference values is derived from tests performed using specific datasets and parameters.

Upon receiving the task, the edge server selects potential low-power parameters based on the mIoU requirements and prior decisions. Our system runs in rounds $t = 1, 2, \ldots, T$, and we assign a parameter in each round. As observed in the preliminary experimental section, three crucial parameters influence the task's power consumption and mIoU, and the relationship between these parameters remains unknown. We aim to minimize energy consumption while satisfying user-defined minimum mIoU constraints by deciding the video resolution, neural network depth, and compression rate. Therefore, we select the video resolution from a finite set $F$, choose the network depth from a finite set $H$, and determine the compression rate from a finite set $K$. We thus define the parameter selection in round $t$ as:

\[ z_t = (f_t, h_t, k_t) \in Z \equiv F \times H \times K. \tag{3} \]

B. System Energy Consumption

For edge servers, energy consumption mainly occurs during the processing of tasks. Once the task parameters are assigned, the tasks are adapted, and the processing commences. On the completion of a task, the processed data is transmitted back to the user over a wireless network. The computational energy consumption is given directly by the edge server, and we use $p^{com}_t(z_t)$ to represent the computational power consumption, where $z_t$ represents the parameter that was selected in round $t$. The function thus models the energy consumption when the parameter selection $z_t$ is applied in round $t$. Aggregating the energy consumption across all rounds, we can define the total energy consumption due to computing as:

\[ p^{com}_{total} = \sum_{t=1}^{T} p^{com}_t(z_t). \tag{4} \]

For data transmission, the energy consumption is calculated as the product of transmission time and transmission power. The transmission time for a task is determined by the size of the task and the transmission rate, represented as $l_t = d_t / R^{tran}$. Here, $d_t$ represents the size of the task transferred at round $t$, and $R^{tran}$ represents the transmission rate. The transmission rate is derived from the Shannon-Hartley theorem: $R^{tran} = B \log_2(1 + S^{tran}/N^{tran})$, where $B$ is the channel bandwidth, $S^{tran}$ is the average received signal power, and $N^{tran}$ is the power of the noise. The total energy consumption during transmission is:

\[ p^{tran}_{total} = \sum_{t=1}^{T} p^{tran}_t, \tag{5} \]

where $p^{tran}_t = o^{tran} l_t$, with $o^{tran}$ being the transmission power when transferring tasks.

In sum, the total energy consumption of the system is a combination of the energy consumed by edge servers during task processing and the energy utilized during data transmission, mathematically represented as:

\[ P = p^{tran}_{total} + p^{com}_{total}. \tag{6} \]

C. Problem Formulation

The mIoU is an important indicator for evaluating semantic segmentation. We denote the mIoU value when parameter $z_t$ is applied to task $n$ by $I_n(z_t)$, which depends on all variables. The user gives the minimum acceptable mIoU value $\lambda_n$ before uploading the task. We then define the constraint as follows:

\[ I_n(z_t) - \lambda_n > 0. \tag{7} \]

Our goal is to maximize $P^{-1}$ while maintaining user mIoU requirements:

\[ \begin{aligned} \text{Maximize} \quad & P^{-1} \\ \text{subject to} \quad & I_n(z_t) - \lambda_n > 0, \\ & P \neq 0. \end{aligned} \tag{8} \]
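The energy model in Eqs. (4) to (6) can be sketched in a few lines of Python. The bandwidth (10 MHz), signal-to-noise ratio (30 dB), and transmission power (300 mW) are the values listed in Table III; the task sizes and per-round computing energies below are made-up numbers, and the function names are hypothetical. The sketch also assumes the signal-to-noise ratio in decibels stands for $S^{tran}/N^{tran}$ after conversion to a linear ratio.

```python
import math


def shannon_rate(bandwidth_hz, snr_db):
    """R = B * log2(1 + S/N), with S/N given in dB and converted to linear."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


def transmission_energy(task_bits, rate_bps, tx_power_w):
    """Energy of one transfer: transmission power times l_t = d_t / R."""
    l_t = task_bits / rate_bps
    return tx_power_w * l_t


def total_energy(task_bits_per_round, compute_energy_per_round,
                 rate_bps, tx_power_w):
    """P = p_tran_total + p_com_total, summed over all rounds."""
    p_tran_total = sum(
        transmission_energy(d_t, rate_bps, tx_power_w)
        for d_t in task_bits_per_round
    )
    p_com_total = sum(compute_energy_per_round)
    return p_tran_total + p_com_total


if __name__ == "__main__":
    rate = shannon_rate(bandwidth_hz=10e6, snr_db=30)   # Table III values
    sizes = [1e6, 2e6]        # hypothetical task sizes in bits
    compute = [1.0, 2.0]      # hypothetical per-round computing energy in joules
    print(rate, total_energy(sizes, compute, rate, tx_power_w=0.3))
```

With these link settings the transmission term is small compared with the hypothetical computing term, which is consistent with the statement that energy consumption at the edge server mainly occurs during task processing.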
where $\mu_1, \cdots, \mu_{|Z|}$ are the expected values of $P_1, \cdots, P_{|Z|}$ and $\Delta_i = \mu^* - \mu_i$.

Proof. The algorithm determines the sample set to be explored based on a threshold. Let $Z^a \cup Z^b \cup Z^c = Z$, $Z^a \cap Z^b \cap Z^c = \emptyset$, and $t^a + t^b + t^c = t$, where $t^a$ represents the total number of rounds that the algorithm has explored in the sample set $Z^a$. Since the initialization of the algorithm selects each parameter once, let $Y^a_{i^a}(t^a)$ denote the number of times the $i^a$-th parameter is selected up to round $t^a$, given by:

\[ Y^a_{i^a}(t^a) = 1 + \sum_{y=|Z^a|+1}^{t^a} [I_y = i^a]_+ , \tag{13} \]

where $[\cdot]_+$ is the indicator function and $I_y = i^a$ indicates that the $i^a$-th parameter is selected in the $y$-th round. Let $l$ be any positive integer. The following inequality can be obtained:

\[ Y^a_{i^a}(t^a) \le l + \sum_{y=|Z^a|+1}^{t^a} [I_y = i^a,\; Y^a_{i^a}(y-1) \ge l]_+ . \tag{14} \]

Let $c_{y,s} = \sqrt{(2 \ln y)/s}$. Since the $i^a$-th parameter is selected in the $y$-th round, its upper confidence index (average reward plus confidence term) must be greater than or equal to that of the optimal parameter. Here we define $\bar{X}^*_{Y^*(y-1)}$ as the average reward of the optimal parameter up until the $(y-1)$-th round:

\[ Y^a_{i^a}(t^a) \le l + \sum_{y=|Z^a|+1}^{t^a} \Big[ \bar{X}^*_{Y^*(y-1)} + c_{y-1,\,Y^*(y-1)} \le \bar{X}_{i^a,\,Y^a_{i^a}(y-1)} + c_{y-1,\,Y^a_{i^a}(y-1)},\; Y^a_{i^a}(y-1) \ge l \Big]_+ . \tag{15} \]

Ranging over all possible values of $Y^*(y-1)$ and $Y^a_{i^a}(y-1)$, the following inequality can be obtained:

\[ Y^a_{i^a}(t^a) \le l + \sum_{y=|Z^a|+1}^{t^a} \Big[ \min_{1<s<y} \big( \bar{X}^*_s + c_{y-1,s} \big) \le \max_{l \le s_{i^a} \le y} \big( \bar{X}_{i^a,s_{i^a}} + c_{y-1,s_{i^a}} \big) \Big]_+ . \tag{16} \]

Before round $y$, any parameter has been chosen at most $y-1$ times, so $s < y$, and $Y^a_{i^a}(y-1) \ge l$, so $s_{i^a} \ge l$. Expanding all the possibilities of $s$ and $s_{i^a}$, the following inequality can be obtained:

\[ Y^a_{i^a}(t^a) \le l + \sum_{y=1}^{\infty} \sum_{s=1}^{y-1} \sum_{s_{i^a}=l}^{y-1} \Big[ \bar{X}^*_s + c_{y,s} \le \bar{X}_{i^a,s_{i^a}} + c_{y,s_{i^a}} \Big]_+ . \tag{17} \]

The following conclusion can be proved by contradiction: if $\bar{X}^*_s + c_{y,s} \le \bar{X}_{i^a,s_{i^a}} + c_{y,s_{i^a}}$, then at least one of the following three inequalities must hold:

\[ \begin{aligned} \bar{X}^*_s &\le \mu(a)^* - c_{y,s}, \\ \mu(a)_{i^a} + c_{y,s_{i^a}} &\le \bar{X}_{i^a,s_{i^a}}, \\ \mu(a)^* &\le \mu(a)_{i^a} + 2 c_{y,s_{i^a}}. \end{aligned} \tag{18} \]

For $s_{i^a} \ge \frac{8 \ln t^a}{\Delta_{i^a}^2}$, if $\bar{X}^*_s + c_{y,s} \le \bar{X}_{i^a,s_{i^a}} + c_{y,s_{i^a}}$, then only the first two inequalities in (18) remain, and at least one of them must hold. Then, according to the Chernoff-Hoeffding bound, we get the following:

\[ \begin{aligned} P\big( \bar{X}^*_s \le \mu(a)^* - c_{y,s} \big) &\le \exp\big( -2 s\, c_{y,s}^2 \big) = y^{-4}, \\ P\big( \mu(a)_{i^a} + c_{y,s_{i^a}} \le \bar{X}_{i^a,s_{i^a}} \big) &\le \exp\big( -2 s_{i^a} c_{y,s_{i^a}}^2 \big) = y^{-4}. \end{aligned} \tag{19} \]

Therefore, the regret after $t^a$ rounds is

\[ \begin{aligned} \mathrm{Reg}^a_{t^a} &= \mu(a)^* t^a - \sum_{j=1}^{|Z^a|} \mu(a)_j\, E\big[ Y^a_j(t^a) \big] \\ &= \sum_{i^a:\Delta_{i^a}>0} \Delta_{i^a} E\big[ Y^a_{i^a}(t^a) \big] \\ &\le 8 \sum_{i^a:\Delta_{i^a}>0} \frac{\ln t^a}{\Delta_{i^a}} + \Big( 1 + \frac{\pi^2}{3} \Big) \sum_{j=1}^{|Z^a|} \Delta_j . \end{aligned} \tag{20} \]

Similarly, we can obtain $\mathrm{Reg}^b_{t^b}$ and $\mathrm{Reg}^c_{t^c}$. The formula can be simplified by factoring out common terms and representing each sample set's regret as a function. Therefore, the total regret after $t$ rounds is:

\[ \begin{aligned} \mathrm{Reg}^T &= \mathrm{Reg}^a_{t^a} + \mathrm{Reg}^b_{t^b} + \mathrm{Reg}^c_{t^c} \\ &\le 8 \Bigg( \sum_{i^a:\Delta_{i^a}>0} \frac{\ln t^a}{\Delta_{i^a}} + \sum_{i^b:\Delta_{i^b}>0} \frac{\ln t^b}{\Delta_{i^b}} + \sum_{i^c:\Delta_{i^c}>0} \frac{\ln t^c}{\Delta_{i^c}} \Bigg) + \Big( 1 + \frac{\pi^2}{3} \Big) \sum_{j=1}^{|Z|} \Delta_j \\ &\le 8 \sum_{i:\Delta_i>0} \frac{\ln T}{\Delta_i} + \Big( 1 + \frac{\pi^2}{3} \Big) \sum_{j=1}^{|Z|} \Delta_j . \end{aligned} \tag{21} \]

VI. PERFORMANCE EVALUATION

In this section, we carry out extensive simulation experiments to demonstrate the effectiveness of the proposed UCB-CoR algorithm in addressing the parameter selection problem. Furthermore, we verify the robustness of the suggested UCB-CoR algorithm by performing experiments to assess its sensitivity to several key parameters.

A. Experimental Setup

In our setup, three pre-trained semantic segmentation models (using the VOC 2012 dataset) reside on the edge server. To accelerate the inference process of our models, our system employs TensorRT as an underlying optimization engine [31]. Moreover, we convert our models to the ONNX format to guarantee seamless compatibility with TensorRT [32], facilitating efficient and swift model execution within our edge computing framework.

We consider the video compression rate set $K = \{100, 20\}$, the video resolution set $F = \{128, 256, 320, 480, 512\}$, and the network depth set $H = \{18, 34, 50\}$, and utilize the respective measurements obtained from our testbed in Section III. We used the H.264 video compression standard, which
TABLE III
EXPERIMENTAL SETUP

Parameter                      Value
confidence value δ             4
total parameters Z             60
sample set z^a                 20
threshold interval λ_c         52-57
transmission power o_tran      300 mW
signal to noise ratio          30 dB
bandwidth B                    10 MHz

[Figure 6, panels (c) Threshold z^b and (d) Threshold z^c. x-axis: Number of Iterations; y-axis: Correct Rate.]
Fig. 6. Correct rate for UCB algorithm used in different threshold
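As a rough illustration of how the link parameters in Table III translate into a transmission cost, the sketch below assumes the Shannon capacity formula for the uplink rate and a simple energy model (transmission power multiplied by airtime). Both modeling choices, the function names, and the 1 MB payload are assumptions made for illustration; they are not taken from this section.

```python
import math


def shannon_rate_bps(bandwidth_hz: float, snr_db: float) -> float:
    """Shannon capacity B * log2(1 + SNR), with the SNR given in dB."""
    snr_linear = 10 ** (snr_db / 10)
    return bandwidth_hz * math.log2(1 + snr_linear)


def transmission_energy_j(payload_bits: float, tx_power_w: float, rate_bps: float) -> float:
    """Energy spent on the air: transmission power times transmission time."""
    return tx_power_w * payload_bits / rate_bps


# Link parameters from Table III: B = 10 MHz, SNR = 30 dB, power = 300 mW.
rate = shannon_rate_bps(bandwidth_hz=10e6, snr_db=30.0)

# Hypothetical payload of 1 MB (8e6 bits); not a value from the paper.
energy = transmission_energy_j(payload_bits=8e6, tx_power_w=0.3, rate_bps=rate)

print(f"rate   = {rate / 1e6:.2f} Mbit/s")
print(f"energy = {energy * 1e3:.2f} mJ")
```

Under these assumptions the rate grows linearly with bandwidth, so the airtime and the transmission energy for a fixed payload shrink as the bandwidth increases.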
[Figure: percentage (0 to 100%) versus Number of Iterations (0 to 50,000).]

after 40,000 iterations. We believe that as the number of
iterations approaches infinity, the probability of the algorithm
selecting the correct parameters approaches 100%. This occurs
because the confidence interval of the average return for each
candidate narrows with increasing trials, allowing for a more
accurate determination of the return.
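The narrowing of the confidence interval can be illustrated with the standard UCB1 confidence radius from [20], sqrt(2 ln t / n), where t is the number of rounds played and n is the number of times a candidate has been tried. The sketch below assumes this standard form; the exact radius used by UCB-CoR may differ, and the function names are hypothetical.

```python
import math


def confidence_radius(total_rounds: int, pulls: int) -> float:
    """Standard UCB1 exploration bonus: sqrt(2 * ln(t) / n)."""
    if total_rounds < 1 or pulls < 1:
        raise ValueError("total_rounds and pulls must be positive")
    return math.sqrt(2.0 * math.log(total_rounds) / pulls)


def ucb_index(mean_return: float, total_rounds: int, pulls: int) -> float:
    """Optimistic estimate of a candidate: empirical mean plus the bonus."""
    return mean_return + confidence_radius(total_rounds, pulls)


# Radius for one candidate after 40,000 rounds, for growing trial counts.
trial_counts = (10, 100, 1_000, 10_000)
radii = [confidence_radius(40_000, n) for n in trial_counts]

for n, r in zip(trial_counts, radii):
    print(f"trials={n:>6}  radius={r:.4f}")
```

For a fixed round count the radius shrinks in proportion to 1/sqrt(n), so a candidate tried 100 times as often has a radius 10 times smaller, which is why the estimate of its return becomes more reliable.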
[Figure 7, panels (a) 500-1000 times and (b) 25-275 times. x-axis: Number of Iterations; y-axis: Energy Consumption (J).]
Fig. 7. Fluctuation of the energy consumption value during different iterations

consumption data. We calculated the average, maximum, and
minimum values, as shown in Fig. 7a. The steadily rising mean
value indicates the algorithm’s robustness. However, as the
number of iterations increases, the fluctuation in total power

[Figure 8, panels (a) Channel Bandwidth and (b) Video Frame Rate. x-axes: Bandwidth (1 MHz to 40 MHz) and Frame Rate (15 FPS to 90 FPS); y-axis: Energy Consumption (J).]
Fig. 8. Fluctuation of the energy consumption value during different iterations

We also reduced the number of iterations to 300 and
recorded total power consumption every 25 rounds. The results,
shown in Fig. 7b, reveal that the curve does not rise
smoothly. This is due to the exploration process in the early
stages of iteration, during which non-optimal configuration
parameters are selected, causing total power consumption to
be unstable.

E. Impact of Channel Bandwidth and Video Frame Rate
In this subsection, we examine the influence of channel
bandwidth and video frame rate on power consumption when
using the UCB-CoR algorithm.
Firstly, we evaluate the effect of channel bandwidth on the
system’s power consumption. We performed 10,000 iterations
of the UCB-CoR algorithm under five channel bandwidths:
1 MHz, 5 MHz, 10 MHz, 20 MHz, and 40 MHz. The results
in Fig. 8a show that as the channel bandwidth
increases, the energy consumption of the system decreases.
However, the power consumption does not change much for
channel bandwidths of 10 MHz, 20 MHz, and 40 MHz, suggesting
that there is a limit to how much power consumption can
be reduced by increasing the bandwidth.
Secondly, we evaluate the impact of video frame rate on the
system’s power consumption. We conducted 10,000 iterations of
the UCB-CoR algorithm under four video frame rates: 15 FPS,
30 FPS, 60 FPS, and 90 FPS. The results, as depicted in Fig. 8b,
demonstrate that as the video frame rate increases, the power
consumption of the system also increases.

F. Contrast Experiment
We compare the performance of the UCB-CoR algorithm
with the Thompson Sampling and Epsilon-Greedy algorithms on our
energy consumption problem. Using the same iteration
count, we assessed total energy consumption and total regret
value.
The total energy consumption comparison results are shown
in Fig. 9a. During the initial phase
of the iteration (0-10,000), the performance of the Epsilon-greedy
algorithm and the UCB-CoR algorithm appears to be
relatively similar (roughly 123,000 Joules). However, as the
number of iterations increased (10,000-30,000), the advantage
of the UCB-CoR algorithm became increasingly evident. At
30,000 iterations, the total energy consumed with Epsilon-greedy
is roughly 422,000 Joules, whereas that with UCB-CoR is roughly 396,000
Joules. This observation suggests that the UCB algorithm is
more effective in achieving energy efficiency than the Epsilon-greedy
algorithm in MAB problems, especially in scenarios
where the number of iterations is high.

[Figure 9, panels (a) Energy Consumption and (b) Regret. Legend: Thompson-sampling, Epsilon-greedy, UCB-CoR.]
Fig. 9. Comparison of the total power consumption and regret

The comparison results of the total regret are shown in Fig.
9b. The situation is similar to Fig. 9a: in the initial
phase of the iteration, the Epsilon-greedy algorithm performs
comparably to the UCB-CoR algorithm,
and as the number of iterations increases, the advantage of
the UCB-CoR algorithm becomes clear. To provide a better
evaluation of this advantage, we increased the number of iterations
to 100,000. Fig. 10a clearly shows that the Thompson
sampling algorithm cannot maintain performance over a long
period. The experimental result in Fig. 10b compares
the UCB-CoR algorithm and the epsilon-greedy algorithm
over a large number of iterations. Specifically, the total regret
value of the UCB-CoR algorithm is substantially lower than that of the
epsilon-greedy algorithm at both 30,000 and 100,000 iterations.
At 30,000 iterations, the UCB-CoR algorithm achieves a
total regret value that is 67% of that of the epsilon-greedy
algorithm. At 100,000 iterations, the total regret value of UCB-CoR
further improves to only 44% of that of the epsilon-greedy
algorithm.
On the other hand, the experimental results show that
the total energy consumption and total regret value of the
Thompson sampling algorithm are significantly higher than
those of the UCB-CoR and Epsilon-greedy algorithms. The
underperformance of the Thompson sampling algorithm could
be attributed to its underlying probabilistic model. The Thompson
sampling algorithm assumes that the reward distribution
of each arm is stationary and independent. In a dynamic
environment, such as the one in our system, this assumption
does not hold true. Furthermore, the algorithm’s reliance
on probabilistic sampling leads to inefficient exploration-exploitation
trade-offs, resulting in suboptimal performance in
our experimental setting.

[Figure 10, panels (a) UCB-CoR and Thompson Sampling and (b) UCB-CoR and Epsilon Greedy. x-axis: Number of Iterations (30,000 to 100,000); y-axis: Total Regret.]
Fig. 10. Comparison of UCB-CoR and Thompson Sampling and Epsilon Greedy

Finally, we contrast the UCB algorithm with our proposed
UCB-CoR algorithm. As presented in Fig. 11, at
1,000 iterations, the power expenditure of UCB-CoR with
λ = 5 remains below that of both the UCB algorithm and
UCB-CoR with λ = 3. However, the difference between these
measurements tends to diminish with an increasing number of
iterations. This is primarily because having more
sample sets shortens the exploration time of the algorithm
during the initial phase, and this gap becomes smaller as the
optimal arm is identified. Therefore, more sample sets do not
bring a more significant reduction in energy consumption.

[Figure 11. y-axis: Energy Consumption (J); legend entry: UCB.]

consumption. We formulated a problem concerning overall energy
consumption and accuracy constraints, casting it into the
Multi-Armed Bandit (MAB) framework and solving it using
the Upper Confidence Bound (UCB) algorithm. We proposed
the UCB-CoR algorithm to address multi-constraint problems,
improving efficiency in semantic segmentation scenarios. Numerous
simulation experiments demonstrated that the UCB-CoR
algorithm effectively minimizes energy consumption in
multi-constrained situations and offers clear advantages over
other algorithms. In future work, we plan to consider additional
parameters, such as multi-task computing on the same
edge node, and enhance our approach by incorporating edge
learning into our research.

ACKNOWLEDGMENT
This work is partially supported by JSPS KAKENHI
Grant Numbers JP19K20250, JP20H04174, JP20K11784,
JP22K11989, JP23K11063, Leading Initiative for Excellent
Young Researchers (LEADER), MEXT, and JST, PRESTO
Grant Number JPMJPR21P3, Japan. He Li is the corresponding
author.

REFERENCES
[1] C. Kim and J.-N. Hwang, “Object-based video abstraction for video surveillance systems,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 12, pp. 1128–1138, 2002.
[2] R. Yang and Y. Yu, “Artificial convolutional neural network in object detection and semantic segmentation for medical imaging analysis,” Frontiers in Oncology, vol. 11, p. 638182, 2021.
[3] V. Singh and A. K. Misra, “Detection of plant leaf diseases using image segmentation and soft computing techniques,” Information Processing in Agriculture, vol. 4, no. 1, pp. 41–49, 2017.
[4] S. Zhu, K. Ota, and M. Dong, “Green AI for IIoT: Energy efficient intelligent edge computing for industrial internet of things,” IEEE Transactions on Green Communications and Networking, vol. 6, no. 1, pp. 79–88, 2021.
[5] T. Dreibholz and S. Mazumdar, “Towards a lightweight task scheduling framework for cloud and edge platform,” Internet of Things, vol. 21, p. 100651, 2023.
[6] J. Xu, K. Ota, and M. Dong, “Energy efficient hybrid edge caching scheme for tactile internet in 5g,” IEEE Transactions on Green Communications and Networking, vol. 3, no. 2, pp. 483–493, 2019.
[14] A. Galanopoulos, J. A. Ayala-Romero, D. J. Leith, and G. Iosifidis, “Automl for video analytics with edge computing,” in IEEE INFOCOM 2021-IEEE Conference on Computer Communications, pp. 1–10, IEEE, 2021.
[15] J. Ji, K. Zhu, C. Yi, and D. Niyato, “Energy consumption minimization in uav-assisted mobile-edge computing systems: Joint resource allocation and trajectory design,” IEEE Internet of Things Journal, vol. 8, no. 10, pp. 8570–8584, 2020.
[16] H. Tessier, V. Gripon, M. Léonardon, M. Arzel, D. Bertrand, and T. Hannagan, “Energy consumption analysis of pruned semantic segmentation networks on an embedded gpu,” in International Conference on System-Integrated Intelligence, pp. 553–563, Springer, 2023.
[17] A. Kalai and S. Vempala, “Efficient algorithms for online decision problems,” Journal of Computer and System Sciences, vol. 71, no. 3, pp. 291–307, 2005.
[18] H. Li, K. Ota, and M. Dong, “Learning iot in edge: Deep learning for the internet of things with edge computing,” IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.
[19] V. Cardellini, V. De Nitto Personé, V. Di Valerio, F. Facchinei, V. Grassi, F. Lo Presti, and V. Piccialli, “A game-theoretic approach to computation offloading in mobile cloud computing,” Mathematical Programming, vol. 157, no. 2, pp. 421–449, 2016.
[20] P. Auer, N. Cesa-Bianchi, and P. Fischer, “Finite-time analysis of the multiarmed bandit problem,” Machine Learning, vol. 47, pp. 235–256, 2002.
[21] S. Raza, S. Wang, M. Ahmed, M. R. Anwar, M. A. Mirza, and W. U. Khan, “Task offloading and resource allocation for iov using 5g nr-v2x communication,” IEEE Internet of Things Journal, vol. 9, no. 13, pp. 10397–10410, 2021.
[22] C. Battiloro, P. Di Lorenzo, M. Merluzzi, and S. Barbarossa, “Lyapunov-based optimization of edge resources for energy-efficient adaptive federated learning,” IEEE Transactions on Green Communications and Networking, 2022.
[23] G. Zhang, W. Zhang, Y. Cao, D. Li, and L. Wang, “Energy-delay tradeoff for dynamic offloading in mobile-edge computing system with energy harvesting devices,” IEEE Transactions on Industrial Informatics, vol. 14, no. 10, pp. 4642–4655, 2018.
[24] M. Li, X. Zhou, T. Qiu, Q. Zhao, and K. Li, “Multi-relay assisted computation offloading for multi-access edge computing systems with energy harvesting,” IEEE Transactions on Vehicular Technology, vol. 70, no. 10, pp. 10941–10956, 2021.
[25] S. Wang, J. Dai, Z. Liang, K. Niu, Z. Si, C. Dong, X. Qin, and P. Zhang, “Wireless deep video semantic transmission,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 214–229, 2022.
[26] Z. Yang, M. Chen, Z. Zhang, and C. Huang, “Energy efficient semantic communication over wireless networks with rate splitting,” arXiv preprint arXiv:2301.01987, 2023.
[27] J. Kang, H. Du, Z. Li, Z. Xiong, S. Ma, D. Niyato, and Y. Li, “Personalized saliency in task-oriented semantic communications: Image transmission and performance analysis,” IEEE Journal on Selected Areas in Communications, vol. 41, no. 1, pp. 186–201, 2022.
[28] A. Ahamad, C.-C. Sun, and W.-K. Kuo, “Quantized semantic segmentation deep architecture for deployment on an edge computing device for image segmentation,” Electronics, vol. 11, no. 21, p. 3561, 2022.
[29] L. Bai, Y. Zhao, and X. Huang, “A near sensor edge computing system for point cloud semantic segmentation,” in 2022 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1818–1822, IEEE, 2022.
[30] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
[31] Y. Zhou and K. Yang, “Exploring tensorrt to improve real-time inference for deep learning,” in 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pp. 2011–2018, IEEE, 2022.
[32] J. Bai, F. Lu, K. Zhang, et al., “Onnx: Open neural network exchange.” [Online]. Available: https://github.com/onnx/onnx, 2019.

Xingyu Yuan received the B.Eng. degree in Computer Science from Tianjin University of Science and Technology, China, 2019, and M.Eng. degree in Muroran Institute of Technology, Japan, 2023. He is currently pursuing the Ph.D. degree in Electrical Engineering at Muroran Institute of Technology, Japan.

He Li received the B.S., M.S. degrees in Computer Science and Engineering from Huazhong University of Science and Technology in 2007 and 2009, respectively, and Ph.D. degree in Computer Science and Engineering from The University of Aizu in 2015. He is currently an Associate Professor with Department of Sciences and Informatics, Muroran Institute of Technology, Japan. In 2018, he is selected as a Ministry of Education, Culture, Sports, Science and Technology (MEXT) Excellent Young Researcher. His research interests include IoT, edge computing, cloud computing and software defined networking.

Kaoru Ota was born in Aizu-Wakamatsu, Japan. She received M.S. degree in Computer Science from Oklahoma State University, the USA in 2008, B.S. and Ph.D. degrees in Computer Science and Engineering from The University of Aizu, Japan in 2006, 2012, respectively. Kaoru is a Professor and Ministry of Education, Culture, Sports, Science and Technology (MEXT) Excellent Young Researcher with the Department of Sciences and Informatics. She is also the founding Director of Center for Computer Science (CCS) at Muroran Institute of Technology, Japan. From March 2010 to March 2011, she was a visiting scholar at the University of Waterloo, Canada. Also, she was a Japan Society of the Promotion of Science (JSPS) research fellow at Tohoku University, Japan from April 2012 to April 2013. Kaoru is the recipient of IEEE TCSC Early Career Award 2017, The 13th IEEE ComSoc Asia-Pacific Young Researcher Award 2018, 2020 N2Women: Rising Stars in Computer Networking and Communications, 2020 KDDI Foundation Encouragement Award, and 2021 IEEE Sapporo Young Professionals Best Researcher Award, The Young Scientists’ Award from MEXT in 2023. She is Clarivate Analytics 2019, 2021, 2022 Highly Cited Researcher (Web of Science) and is selected as JST-PRESTO researcher in 2021, Fellow of EAJ in 2022.

Mianxiong Dong received B.S., M.S. and Ph.D. in Computer Science and Engineering from The University of Aizu, Japan. He is the Vice President and Professor of Muroran Institute of Technology, Japan. He was a JSPS Research Fellow with School of Computer Science and Engineering, The University of Aizu, Japan and was a visiting scholar with BBCR group at the University of Waterloo, Canada supported by JSPS Excellent Young Researcher Overseas Visit Program from April 2010 to August 2011. Dr. Dong was selected as a Foreigner Research Fellow (a total of 3 recipients all over Japan) by NEC C&C Foundation in 2011. He is the recipient of The 12th IEEE ComSoc Asia-Pacific Young Researcher Award 2017, Funai Research Award 2018, NISTEP Researcher 2018 (one of only 11 people in Japan) in recognition of significant contributions in science and technology, The Young Scientists’ Award from MEXT in 2021, SUEMATSU-Yasuharu Award from IEICE in 2021, IEEE TCSC Middle Career Award in 2021. He is Clarivate Analytics 2019, 2021, 2022 Highly Cited Researcher (Web of Science) and Foreign Fellow of EAJ.