IET Electronics Letters Template


Title Goes Here...

Authors Names Goes Here....

Abstract Goes Here

Introduction: Intro Goes Here

Design Model and Technique: To achieve the research objectives of this paper, we applied knowledge transfer, the process of transferring knowledge (methods, techniques, algorithms, ...) from one domain to another, not necessarily related, domain. This allowed us to leverage the power of deep image inpainting for the completion (extrapolation) of partial load-pull contour maps.

The advances in deep neural networks and the work of Egbert et al. [1], whose approach laid the foundation for applying deep image completion techniques to partial load-pull extrapolation, were the motivating factors for this research work.

In contrast to our model, their model focused only on using a gradient-based image completion technique to train a generative adversarial network (GAN) on known load-pull contours, for which they used the Wasserstein GAN with Gradient Penalty (WGAN-GP) [4]. Our model proposes a better alternative by presenting a more advanced and robust deep neural network model that is lightweight, fast and highly efficient in carrying out deep image completion (extrapolation) tasks, which we named the Generative Learning Block (GLB) model.

The model is built on three efficient deep image inpainting network models, namely:
• the Context Encoder (CE) [9], which laid the architectural foundations for many follow-up papers in deep image inpainting,
• the PEPSI++ [11] model, which provides a more robust, faster and lightweight network for deep image inpainting (completion), mainly designed to overcome the drawbacks and limitations of the coarse-to-fine network and of the traditional Context Encoder by greatly reducing computational cost and memory consumption, hence providing a cost-effective way to carry out future deep image completion tasks,
• the Generative Multi-Column Convolutional Neural Network (GMCNN) [12], which synthesizes different image components in parallel within one stage.

By harnessing the deep image inpainting models listed above, together with several other deep neural network models and functional layers that we briefly discuss below, we built a new variant of the PEPSI++ [11] network that solves deep image completion tasks accurately and efficiently and scales to the completion of simulated load-pull contours. The design is new in that it projects the load points and their related FOMs channel-wise.

Additionally, our network uses a GAN-like architecture with both the Generator and the Discriminator of the vanilla GAN [3]; however, its generator uses an Encoder-Decoder style network (proposed in the traditional Context Encoder [9]). It also employs two types of adversarial networks: the joint global and local discriminator networks, and an advanced variant of the PatchGAN discriminator called the Region Ensemble Discriminator (RED) [5].

Also, several jointly functional network models, namely the U-Net model [10], the self-attention model of SAGAN (Self-Attention GAN) [14], the Modified Contextual Attention Module (MCAM) from PEPSI++ [11] and the Conditional GAN (CGAN) [8], complement each other to perform image completion tasks efficiently and accurately in a wide range of applications not limited to images.

System Architecture: Figure 1 presents the general architecture of the Generative Learning Block (GLB), showing the adversarial training process of the Generator and Discriminator networks. It also presents the different cost functions (objective functions) used to measure "how good" the neural network performed with respect to its given training samples and the expected outputs. In the subsection below we look at how the different custom cost functions were applied to improve the accuracy, performance and predictive capabilities of our model.

Fig. 1. The Generative Learning Block (GLB) Model

Figure 1 shows an overview of the training process of the GLB model. A brief description of the different training steps is as follows:

1 Data Pre-processing Phase: The measured data collected from ADS is in comma-separated values (CSV) format and has features whose values are either scaled differently or span a wider range than the [−1, 1] interval required for successfully training deep neural networks. We therefore applied a two-step data pre-processing: Z-score standardization of the dataset to zero mean (µ = 0) and unit variance (σ = 1), equation (2), followed by Tanh or Sigmoid normalization, equation (1), to bring the dataset into the range [−1, 1] (Tanh) or [0, 1] (Sigmoid).

$$S(x) = \frac{1}{1 + e^{-x}} \quad \text{or} \quad \tanh\left(\frac{x}{2}\right) = \frac{1 - e^{-x}}{1 + e^{-x}} \tag{1}$$

$$x_{std} = \frac{x - \mu}{\sigma} \tag{2}$$

We also post-processed the outputs produced by our model to project them back into their original domain, which is important for visualizing and understanding the results. To achieve this, we performed the inverse transformation (re-normalization) of the outputs as shown in the equation below.

$$x_{renorm} = 2\tanh^{-1}(x_{norm}) \cdot \sigma + \mu \tag{3}$$

2 Dataset Masking Phase: Since we need to train our model to fill (complete) missing parts (regions) in the training dataset, we need to feed it a partially filled dataset. To achieve this, we applied binary masking to the collected dataset as shown in equation (4), where $x_i$ represents a pixel (point) in our training dataset:

$$M = \begin{cases} 0 & \text{if } x_i \text{ is an unknown pixel (hole)} \\ 1 & \text{if } x_i \text{ is a known (valid) pixel} \end{cases} \tag{4}$$

Notice from Figure 1 that we used element-wise random masking to mask our dataset channel-wise, so as to give our neural network model more flexibility in locating and filling (completing) the missing regions in the masked dataset.

3 Input Conditioning Generator and Discriminator Phase: This phase is inspired by the work of Mehdi Mirza and Simon Osindero [8], who proposed a way of training generative models called the “Conditional GAN (CGAN)”, a variant of the vanilla GAN constructed by feeding conditional data to the generator and the discriminator networks. We applied CGAN to our model to learn the structured loss, which penalizes our model based on the joint configuration (concatenation) of the outputs from the Generator and Discriminator with the original input, as shown in the equation below:

$$G(X, Y) \mapsto \hat{Y}, \quad \text{where } X, Y, \hat{Y} \in \{x_{ijk};\ y_{ijk};\ \hat{y}_{ijk} \ldots\}^{H \times W \times N}$$

$$D(\hat{Y}, Y) \mapsto \{0: \text{fake};\ 1: \text{real}\}$$

NB: X represents the partially filled map, Ŷ the predicted (generated) map from the generator and Y the observed image with the full (complete) map. G and D represent the Generator and Discriminator, respectively, and H × W × N is the shape of each map.
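The two-step pre-processing of step 1 and its inverse can be illustrated with a minimal sketch, assuming equations (1) to (3) are applied per feature. The function names, the use of the population standard deviation and the sample values are illustrative assumptions, not details taken from the paper.

```python
import math
from statistics import fmean, pstdev


def standardize(values):
    """Z-score standardization, equation (2): (x - mu) / sigma."""
    mu = fmean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0.0:
        raise ValueError("constant feature: sigma is zero")
    return [(x - mu) / sigma for x in values], mu, sigma


def tanh_normalize(x_std):
    """Tanh normalization, equation (1): tanh(x / 2), which lies in (-1, 1)."""
    return [math.tanh(x / 2.0) for x in x_std]


def renormalize(x_norm, mu, sigma):
    """Inverse transform, equation (3): 2 * atanh(x_norm) * sigma + mu."""
    return [2.0 * math.atanh(x) * sigma + mu for x in x_norm]


# Hypothetical output-power samples; the numbers are made up for illustration.
p_out = [28.1, 29.4, 30.2, 31.0, 27.5]
x_std, mu, sigma = standardize(p_out)
x_norm = tanh_normalize(x_std)
restored = renormalize(x_norm, mu, sigma)
```

In this sketch `restored` matches `p_out` up to floating-point error, which is the property the post-processing step relies on when projecting model outputs back to the original domain.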

ELECTRONICS LETTERS 01st November 2021 Vol. 00 No. 00
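The binary masking of step 2 and the channel-wise concatenation used for conditioning in step 3 can be sketched as follows. This is only an illustration under stated assumptions: the window size, the hole probability, the dummy map values and the choice of which two maps are concatenated are all hypothetical, and nested Python lists stand in for tensors.

```python
import random


def random_mask(height, width, channels, hole_prob, seed=0):
    """Element-wise binary mask M as in equation (4): 0 = hole, 1 = known."""
    rng = random.Random(seed)
    return [[[0 if rng.random() < hole_prob else 1 for _ in range(channels)]
             for _ in range(width)]
            for _ in range(height)]


def apply_mask(y, mask):
    """Partially filled map X: element-wise product of Y and M."""
    return [[[v * m for v, m in zip(px, mpx)]
             for px, mpx in zip(row, mrow)]
            for row, mrow in zip(y, mask)]


def concat_channels(a, b):
    """CGAN-style conditioning: join two H x W maps along the channel axis."""
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


H, W, C = 4, 4, 5  # hypothetical window; five channels as in the data reshaping
y_full = [[[0.5] * C for _ in range(W)] for _ in range(H)]  # dummy complete map
mask = random_mask(H, W, C, hole_prob=0.3)
x_partial = apply_mask(y_full, mask)
d_input = concat_channels(x_partial, y_full)  # H x W x 2C conditioned input
```

Masking each channel value independently mirrors the element-wise random masking described in step 2; a block-shaped or per-pixel mask would be an equally valid variant of the same idea.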


In the notation of step 3, H, W and N denote the height, width and number of channels, respectively, and (H × W) defines the window of our map. We use the lower-case indices (i, j, k) to represent the exact position of a pixel in the defined window (map).

4 The Generator Model: Our generator model modifies the single generative network of PEPSI++, which is made up of a single shared encoding network and a parallel decoding network called “the coarse and inpainting paths”, by adding extra layers and models such as the self-attention layer, the U-Net model, an Instance Normalization [13] layer, up-sampling and down-sampling layers at the input and output, and some extra convolutional layers. We also modified the decoding network to use a transposed convolutional layer (ConvTranspose2D) with a concatenation layer rather than the convolution layer plus up-sampling proposed in PEPSI++. This allowed us to use a U-Net-like architecture in our decoding network, which adds skip connections between mirrored layers in the encoder and decoder stacks. The modified decoding network helped to solve the loss of border pixels in every convolution, which contributed to improving the prediction capabilities of our Generator model.

5 The Discriminator Models: We trained our model using the Global and Local Discriminators used in [9] and the Region Ensemble Discriminator (RED) proposed in [5]. We later compared the results from both models to see how each impacts the training process in terms of performance, its ability to accurately distinguish between real and fake (generated) load-pull contour maps, and its capacity to penalize the generator so as to improve its prediction capabilities.

6 Cost or Loss Functions: Figure 1 shows the different cost functions applied to our GLB model. The cost (loss) function is a key element in training any machine learning or deep learning network, as it measures the error between the values predicted by our model and the actual (real) expected values. These errors are then used to update the weights of the model through a process called “backpropagation”. We used the following cost functions to train our GLB model: the Confidence-Driven (Pixel-Wise) Reconstruction Loss [12], the Wasserstein GAN Gradient Penalty (WGAN-GP) [4], the hinge version of the adversarial loss inspired by [6], the Least Squares Loss [7], and the Wasserstein Loss [2]. In addition to these losses, we also implemented two custom loss functions (the Coarse and Inpainting Path Losses) inspired by PEPSI++ [11] to optimize both the coarse and inpainting paths of our Generator model.

Cost (Objective) Functions:

• Confidence-Driven (Pixel-wise) Reconstruction Loss [12]

$$M_w^i = (g * \bar{M}^i) \odot M, \quad \text{where } \bar{M}^i = 1 - M - M_w^{i-1}, \quad M_w^0 = 0 \tag{5}$$

In equation (5), a Gaussian filter g is used to convolve $\bar{M}^i$, propagating the confidence of known pixels to unknown ones over several iterations in order to generate a weighted loss mask $M_w$. The final reconstruction loss is

$$L_c = \left\| \left( Y - G([X, M]; \theta) \right) \odot M_w \right\|_1 \tag{6}$$

where $G([X, M]; \theta)$ is the output of our generator model G and θ denotes its learnable parameters.

• Coarse and Inpainting Path Loss [11]

Inpainting Path:

$$L_{inp} = \frac{\lambda_i}{N} \sum_{n=1}^{N} \left\| X_i^{(n)} - Y^{(n)} \right\|_1 \tag{7}$$

Coarse Path:

$$L_C = \frac{\lambda_c}{N} \left( 1 - \frac{k}{k_{max}} \right) \sum_{n=1}^{N} \left\| X_c^{(n)} - Y^{(n)} \right\|_1 \tag{8}$$

where $X_i^{(n)}$ and $X_c^{(n)}$ are the nth maps in a mini-batch generated via the inpainting and coarse paths, respectively, and $Y^{(n)}$ is the nth complete map they are paired with. $\lambda_i$ and $\lambda_c$ are the hyper-parameters controlling the contributions from each loss term, and $k$ and $k_{max}$ represent the current iteration of the learning process and the maximum number of iterations, respectively.

• Hinge Loss [6]

$$L_G = -\mathbb{E}_{z \sim p_z,\, y \sim p_{data}} \left[ D(G(z), y) \right] \tag{9}$$

$$L_D = -\mathbb{E}_{(x, y) \sim p_{data}} \left[ \min(0, -1 + D(x, y)) \right] - \mathbb{E}_{z \sim p_z,\, y \sim p_{data}} \left[ \min(0, -1 - D(G(z), y)) \right] \tag{10}$$

• Least Squares Loss [7]

$$\min_D L(D) = \frac{1}{2} \mathbb{E}_{x \sim p_{data}(x)} \left[ (D(x) - b)^2 \right] + \frac{1}{2} \mathbb{E}_{z \sim p_z(z)} \left[ (D(G(z)) - a)^2 \right]$$

$$\min_G L(G) = \frac{1}{2} \mathbb{E}_{z \sim p_z(z)} \left[ (D(G(z)) - c)^2 \right] \tag{11}$$

where a and b are the labels for fake data and real data, respectively, and c denotes the value that G wants D to believe for fake data. Mao et al. [7] also proposed parameter selections that work well in most GAN training, one of which corresponds to minimizing the Pearson χ² divergence. For it, they proposed choosing values of a, b and c that satisfy the conditions b − c = 1 and b − a = 2.

Data Collection and Training: The data used in this project was generated using Keysight’s PathWave Advanced Design System (ADS) by simulating load-pull contour maps of the FOMs (output power, power-added efficiency and gain) using the one-tone load-pull simulation design guide called HB1Tone-Loadpull (provided by Keysight Technologies). We collected 320 CSV data files, where one CSV file contains 100 simulated load points with their corresponding figures of merit. We further split the dataset into a training dataset (93.75% = 300 training samples) and a testing dataset (6.25% = 20 test samples). The FOMs of the resulting datasets were pre-processed by standardizing to zero mean and unit variance and then applying sigmoid or hyperbolic tangent (tanh) normalization, as recommended by [1].

We also reshaped the data before feeding it to the deep neural network models by projecting both the reflection coefficient and the FOMs channel-wise into five channels (C = 5): ($X_{imag}$, $X_{real}$, $P_{out}$, PAE, GAIN). The five channels represent a single pixel location ($P_{ijk}$), and the data is reshaped as (N, C) → (N, H, W, C), where N represents the number of training/testing samples, (H × W) the defined window and C the number of channels. In the training phase, we optimized the cost by controlling the importance of each loss term with a λ hyperparameter, as shown in Table 1. Another key optimization term is the learning rate (lr). To control the learning rate of the Generator and Discriminator models, we used the Adam optimizer, a stochastic gradient-based optimization method that has shown great success in deep learning, especially when optimizing GANs.

Table 1: Training configurations for the GLB model ($\lambda_{inp} = \frac{l_{inp}}{N}$, $\lambda_{cp} = \frac{l_{cp}}{N}\left(1 - \frac{k}{k_{max}}\right)$, $\lambda_{rc} = \frac{l_{rc}}{N}$; N: batch size; ($l_{inp}$, $l_{cp}$, $l_{rc}$): loss terms).

model/params           loss   λhp   lr    kg
Generator              NA     NA    NA    NA
Global Discriminator   NA     NA    NA    NA
Local Discriminator    NA     NA    NA    NA
RED                    NA     NA    NA    NA

NB: As $k \to k_{max}$, $\lambda_{cp} \to 0$, thereby reducing the influence of $\lambda_{cp}$ on the inpainting path, given that the model has by then learned to accurately reconstruct partial load-pull maps.

Results and Discussion: Text Goes Here

Conclusion: Text Goes Here

Acknowledgment: This work has been supported by The IET

References

[1] Austin Egbert et al. “Partial Load-Pull Extrapolation Using Deep Image Completion”. In: 2020 IEEE Texas Symposium on Wireless and Microwave Circuits and Systems (WMCS) (2020), pp. 1–5.

[2] Charlie Frogner et al. Learning with a Wasserstein Loss. 2015.
arXiv: 1506.05439 [cs.LG].
[3] Ian Goodfellow et al. “Generative Adversarial Nets”. In: ArXiv
(June 2014).
[4] Ishaan Gulrajani et al. “Improved Training of Wasserstein
GANs”. In: (Mar. 2017).
[5] Hengkai Guo et al. “Region ensemble network: Improving
convolutional network for hand pose estimation”. In: 2017 IEEE
International Conference on Image Processing (ICIP) (2017).
DOI: 10.1109/icip.2017.8297136. URL: http://dx.doi.org/10.1109/ICIP.2017.8297136.
[6] Jae Hyun Lim and Jong Chul Ye. Geometric GAN. 2017. arXiv:
1705.02894 [stat.ML].
[7] Xudong Mao et al. Least Squares Generative Adversarial
Networks. 2017. arXiv: 1611.04076 [cs.CV].
[8] Mehdi Mirza and Simon Osindero. Conditional Generative
Adversarial Nets. 2014. arXiv: 1411.1784 [cs.LG].
[9] Deepak Pathak et al. “Context Encoders: Feature Learning by
Inpainting”. In: 2016 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR) (2016), pp. 2536–2544.
[10] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net:
Convolutional Networks for Biomedical Image Segmentation.
2015. arXiv: 1505.04597 [cs.CV].
[11] Y. Shin et al. “PEPSI++: Fast and Lightweight Network for
Image Inpainting”. In: IEEE Transactions on Neural Networks
and Learning Systems 32 (2021), pp. 252–265.
[12] Yi Wang et al. “Image Inpainting via Generative Multi-column
Convolutional Neural Networks”. In: NeurIPS. 2018.
[13] Yuxin Wu and Kaiming He. Group Normalization. 2018. arXiv:
1803.08494 [cs.CV].
