A Supervised Learning Approach For Differential Entropy Feature-Based Spectrum Sensing
Department of Electronics and Communication Engineering, PES University, Bengaluru 560085, India
1 [email protected], 2 [email protected],
3 [email protected], 4 [email protected]
978-1-6654-4086-8/21/$31.00 ©2021 IEEE
Authorized licensed use limited to: NATIONAL INSTITUTE OF TECHNOLOGY WARANGAL. Downloaded on January 02,2023 at 09:49:54 UTC from IEEE Xplore. Restrictions apply.

Abstract—In this work, we consider a supervised machine learning-based approach for spectrum sensing in cognitive radios. The noise process is assumed to follow a generalized Gaussian distribution, which is of practical relevance. For classification, we consider the differential entropy estimate in the received observations as a feature vector. For our comparative study, we consider the support vector machine, K-nearest neighbor, random forest and logistic regression techniques. Through experimental results based on real-world captured datasets, we show that the proposed differential entropy feature-based technique outperforms the energy-based approach in terms of probability of detection. The proposed technique is particularly useful under low signal-to-noise ratio conditions, and when the noise distribution has heavier tails.

Index Terms—Cognitive radios, differential entropy, generalized Gaussian noise, spectrum sensing, supervised learning algorithms.

I. INTRODUCTION

With the evolution of 5G networks and the advent of the internet-of-things, more devices are becoming a part of the internet and wireless communications. To accommodate these devices, the requirement of bandwidth, an expensive and limited commodity, has also grown significantly. Therefore, efficient use of bandwidth is essential, since it saves cost and increases the number of devices that can transmit information. Typically, spectrum is available both as licensed bands and unlicensed communication bands. The licensed spectrum is controlled by the owner, or a primary user (PU) of that spectrum. Cognitive radios (CR) are smart devices capable of using a licensed spectrum when the PU is not active on its licensed band. Therefore, a CR is also called a secondary user (SU), since it is only allowed to use the spectrum when the PU is inactive [1].

As a secondary user, the cognitive radio first needs to perform spectrum sensing (SS) to gain access to a channel. SS is a vital function because it ensures that the CR does not cause interference to the licensed user of the spectrum. SS is framed as a detection problem in which the detector picks one of two hypotheses, the null hypothesis and the alternate hypothesis, where the null hypothesis represents the case in which only noise is present and the alternate hypothesis represents the case in which the signal and the noise are present together [2]. The performance of a detector for this problem is often hindered by noise statistics, multi-path fading, shadowing and receiver uncertainty issues [3]. Hence, the performance of the detection algorithm that analyses the spectrum at the receiver is of critical importance.

Several methods have been proposed in the literature for performing SS. The popular methods include the energy detector (ED), the matched filter detector (MFD), and the cyclostationary feature detector [4]. In ED, the received signal energy is compared against a threshold to detect PU activity in a particular frequency band. ED is the most popular and simplest detector and hence has been employed widely [5]. The MFD correlates a known transmitted signal with the received signal to detect the presence of a PU. The cyclostationary feature detector leverages the second-order cyclostationary properties of the PU signal to detect PU activity. However, these methods have several limitations: the performance of ED is inferior at low signal-to-noise ratio (SNR), the cyclostationary detector is computationally complex, and the MFD requires a priori information about the primary user signal and further requires synchronization between SU and PU [6]. The performance of these detectors has been evaluated under known or assumed noise conditions. Additive white Gaussian noise (AWGN) is often chosen to describe the noise due to its simplicity for analysis. However, in many scenarios, it is observed that the effect of interference and noise follows a generalized Gaussian distribution (GGD), e.g., in ultra-wide band communication systems [7–9]. Moreover, knowledge of the primary signal, the noise distribution and the fading statistics of the channel is often limited in a real-world scenario.

To address the issues of the above classical signal processing methods, we rely on supervised machine learning (ML) algorithms for SS. ML-based SS has also received research attention in the recent past [10–13]. However, most of the methods discussed in the literature suffer from two major disadvantages. First, these techniques utilise the sample energy statistic in the received observations to train their respective architectures, which is known to offer poor performance under low SNR and in the presence of GGD noise [14]. Second and more importantly, the performance of these works is evaluated on synthetic or artificially created datasets. Therefore, the validity of the performance of the proposed models in a real scenario is yet to be established. However, there are works that have utilized features other than energy for training the ML model. A low-dimensional probability vector is proposed in [15] to reduce the training time. In [16], a
detector is designed based on a combination of both the energy vector and Zhang’s test statistic [17] from the likelihood ratio as features to train an artificial neural network to perform hybrid spectrum sensing.

In this paper, we propose the use of the differential entropy (DE) vector in the received signal to train some of the popular supervised ML approaches for SS, and evaluate their performance on experimentally captured datasets. The DE statistic is a measure of information in the received observations, and is known to perform better than ED [14]. We compare the performance of the DE-based supervised learning approaches with ED-based counterparts and show that the proposed method outperforms the latter under GGD noise. To the best of our knowledge, utilization of DE as a feature to train an ML model and perform SS on a real-world dataset with GGD noise has not been studied earlier in the literature. The main contributions of this paper can be summarized as follows:

• We propose a supervised ML framework for SS in CR under GGD noise, which is robust to unknown PU and fading statistics. Additionally, GGD noise is a practical alternative model for AWGN.
• We employ the DE metric in the received observations to train the supervised ML architectures.
• We train and evaluate the performance of supervised learning models such as support vector machine, K-nearest neighbor, random forest and logistic regression on two experimentally captured datasets.
• We compare the performance of the proposed techniques with the ED-based supervised ML approach and show that the former outperforms the latter on both datasets.

The rest of this paper is organized as follows. Section II describes the system model. Section III describes the details of the two experimentally captured datasets that are used in our experiments. Section IV discusses the machine learning algorithms used in our work, namely, support vector machines, K-nearest neighbor, random forest and logistic regression. Section V discusses the experimental results and the comparison between the performances of the proposed technique and the energy-based approaches. Finally, concluding remarks are drawn in Section VI.

II. SYSTEM MODEL

Consider a CR node with M observations from a PU transmitter operating in a particular frequency band. Under the null hypothesis, we assume that the received samples are independent and identically distributed, and follow a generalized Gaussian distribution (GGD) with a scale parameter α > 0 and a shape parameter β ∈ (0, 2], whose probability density function f_Z(z) is given by

    f_Z(z) = \frac{\beta}{2\alpha\Gamma(1/\beta)} \exp\left\{ -\left( \frac{|z|}{\alpha} \right)^{\beta} \right\}, \quad z \in \mathbb{R}.    (1)

Under the alternate hypothesis, the received signal statistics are assumed to be completely unknown, since they largely depend on the wireless environment and the PU signal. It is worth noting that this knowledge is not required for the study in this paper.

To construct a feature for classification, we consider the differential entropy (DE) in the received observations under both hypotheses, which quantifies the average uncertainty in the underlying distributions. For a continuous random variable X over the support (−∞, ∞), with a probability density function f_X(x), the differential entropy, h(X), of X is defined as

    h(X) = -\int_{-\infty}^{\infty} f_X(x) \log(f_X(x)) \, dx.    (2)

It should be noted that, similar to the energy statistic, the DE value is higher under the alternate hypothesis as compared to the null hypothesis. The DE feature for samples generated from GGD noise is given by its maximum likelihood estimate, which can be shown to be [14]

    h(Z) = \frac{1}{\beta} - \log\left( \frac{\beta}{2\Gamma(1/\beta)} \right) + \frac{1}{\beta} \log\left( \frac{\beta}{M} \sum_{i=1}^{M} |Z_i - \bar{Z}|^{\beta} \right),    (3)

where β ∈ (0, 2] controls the tail of the GGD, Z_i denotes the ith received sample, i = 1, . . . , M, and Z̄ denotes the sample mean in the received observations, that is,

    \bar{Z} = \frac{1}{M} \sum_{i=1}^{M} Z_i.    (4)

Given that the variance of the noise process is σ_n^2, the parameter α can be obtained as

    \alpha = \sqrt{ \frac{\sigma_n^2 \, \Gamma(1/\beta)}{\Gamma(3/\beta)} }.    (5)

It is established that the well-known Gaussian and Laplacian distributions are special cases of the GGD, for β = 2 and β = 1, respectively. Moreover, it is interesting to note that the DE feature given in (3) reduces to the energy detector (ED) feature when β = 2. In other words, the DE feature-based classifier and the ED feature-based classifier should yield the same performance when β = 2. To see this, consider the test which rejects the noise-only hypothesis when [14]

    h(Z) > \gamma,    (6)

where h(Z) is defined in (3) and γ is a suitably chosen detection threshold. Substituting β = 2 in (3) simplifies the condition to

    \frac{1}{2} - \log\left( \frac{1}{\Gamma(1/2)} \right) + \frac{1}{2} \log\left( \frac{2}{M} \sum_{i=1}^{M} |Z_i - \bar{Z}|^{2} \right) > \gamma.    (7)

Since the first two terms in (7) are constants and the logarithm is monotonically increasing, (7) can be further simplified to get

    \frac{1}{M} \sum_{i=1}^{M} |Z_i|^{2} > \gamma',    (8)

given that E[Z] = 0, where γ' = e^{2γ−1}/(2π) follows from Γ(1/2) = √π. This gives the decision statistic of the energy detector [4].

As mentioned earlier, most of the ML algorithms proposed earlier in the literature have carried out a performance study on artificially generated datasets, which may not yield a
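As a rough illustration of the system model, the scale in (5), the DE feature in (3) and (4), and the energy statistic in (8) can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' implementation: the function names are hypothetical, and the sampler relies on the standard fact that (|Z|/α)^β follows a Gamma(1/β, 1) distribution when Z follows the GGD in (1).

```python
import math
import random


def ggd_scale(sigma2, beta):
    """Scale alpha of a GGD with variance sigma2 and shape beta, as in (5)."""
    return math.sqrt(sigma2 * math.gamma(1.0 / beta) / math.gamma(3.0 / beta))


def ggd_samples(m, sigma2, beta, rng):
    """Draw m zero-mean GGD samples (hypothetical helper, not from the paper).

    Uses the fact that (|Z| / alpha) ** beta is Gamma(1 / beta, 1) distributed.
    """
    alpha = ggd_scale(sigma2, beta)
    samples = []
    for _ in range(m):
        g = rng.gammavariate(1.0 / beta, 1.0)
        sign = 1.0 if rng.random() < 0.5 else -1.0
        samples.append(sign * alpha * g ** (1.0 / beta))
    return samples


def de_feature(z, beta):
    """Differential entropy estimate of (3), using the sample mean of (4)."""
    m = len(z)
    mean = sum(z) / m
    s = sum(abs(zi - mean) ** beta for zi in z)
    return (1.0 / beta
            - math.log(beta / (2.0 * math.gamma(1.0 / beta)))
            + (1.0 / beta) * math.log(beta * s / m))


def energy_feature(z):
    """Energy statistic on the left-hand side of (8)."""
    return sum(zi * zi for zi in z) / len(z)


if __name__ == "__main__":
    rng = random.Random(0)
    noise = ggd_samples(1000, 1.0, 1.0, rng)  # Laplacian noise, unit variance
    print("DE feature:", de_feature(noise, 1.0))
    print("energy feature:", energy_feature(noise))
```

For β = 2, `de_feature` is a monotone function of the centered sample energy, which is the reduction from (7) to (8); for β < 2 the two features weight large-magnitude samples differently.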
increases.

Fig. 1. Performance comparison of DE-based and ED-based random forest classifier, with β = 0.5 and 1, on (a) dataset 1 and (b) dataset 2. [Plots of probability of detection versus SNR (dB), from −30 dB to −10 dB; curves: DE β = 0.5, DE β = 1, ED β = 0.5, ED β = 1.]

Fig. 2. Performance comparison of DE-based random forest classifier, with β = 0.5, 1 and 2, on (a) dataset 1 and (b) dataset 2. [Plots of probability of detection versus SNR (dB), from −30 dB to −10 dB; curves: DE β = 0.5, DE β = 1, DE β = 2.]
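Results of this kind report a probability of detection at a fixed probability of false alarm for a given SNR. The sketch below estimates one such point by Monte Carlo simulation, using the threshold test (6) on the DE feature in place of a trained classifier. Everything about the data is an assumption made for illustration: the noise is synthetic GGD noise with unit variance, the PU signal is taken to be zero-mean Gaussian, and the function names and default parameters are hypothetical. It does not reproduce the captured datasets or the classifiers used in the experiments.

```python
import math
import random


def ggd_noise(m, sigma2, beta, rng):
    """Synthetic zero-mean GGD noise with variance sigma2 (assumed model)."""
    alpha = math.sqrt(sigma2 * math.gamma(1.0 / beta) / math.gamma(3.0 / beta))
    return [rng.choice((-1.0, 1.0)) * alpha
            * rng.gammavariate(1.0 / beta, 1.0) ** (1.0 / beta)
            for _ in range(m)]


def de_feature(z, beta):
    """Differential entropy estimate of (3)."""
    m = len(z)
    mean = sum(z) / m
    s = sum(abs(zi - mean) ** beta for zi in z)
    return (1.0 / beta
            - math.log(beta / (2.0 * math.gamma(1.0 / beta)))
            + math.log(beta * s / m) / beta)


def detection_probability(snr_db, beta, m=200, trials=300, pf=0.1, seed=0):
    """Monte Carlo estimate of the detection probability of the test in (6)."""
    rng = random.Random(seed)
    signal_std = math.sqrt(10.0 ** (snr_db / 10.0))  # noise variance is 1

    # Threshold: empirical (1 - pf) quantile of the feature under noise only.
    h0 = sorted(de_feature(ggd_noise(m, 1.0, beta, rng), beta)
                for _ in range(trials))
    threshold = h0[min(trials - 1, int((1.0 - pf) * trials))]

    # Count how often the feature exceeds the threshold when a signal is present.
    detections = 0
    for _ in range(trials):
        noise = ggd_noise(m, 1.0, beta, rng)
        received = [rng.gauss(0.0, signal_std) + n for n in noise]
        if de_feature(received, beta) > threshold:
            detections += 1
    return detections / trials


if __name__ == "__main__":
    for snr_db in (-20.0, -10.0, 0.0):
        print(snr_db, detection_probability(snr_db, beta=1.0))
```

Setting the threshold from noise-only features is one simple way to hold the false-alarm probability near a target such as 0.1; a trained classifier would replace the threshold comparison with its own decision rule.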
Figure 2 shows the variation of probability of detection with different SNR values for the random forest classifier with different β values for dataset 1 (Figure 2a) and dataset 2 (Figure 2b). The performance improves when β decreases, as expected.

Fig. 3. Performance of DE based random forest classifier with varying β values. [Plot versus β from 0.5 to 2; curves: SNR = −10 dB, −15 dB and −20 dB.]

TABLE I
COMPARISON OF PD VALUES (IN PERCENTAGE) ACROSS ALL SUPERVISED LEARNING TECHNIQUES, FOR DIFFERENT VALUES OF β IN DATASET 1, WITH PF = 0.1, SNR = −16 dB.

Models                 β = 0.75   β = 1    β = 2
SVM                    98.82      81.80    68.10
KNN                    98.71      81.54    68.92
Logistic Regression    98.82      81.80    68.10
Random Forest          98.96      81.38    67.98

Finally, the performance comparison of all the considered supervised ML techniques is given in Tables I and II. The parameters are chosen as follows. The probability of false alarm is set to 0.1, and the SNR is chosen to be −16 dB. Comparing the probability of detection across ML models from Tables I and II, it is observed that the PD values across all classifiers
are similar for a given β value. Overall, it was observed that random forest gave slightly better PD values, for different β and SNR values, across both the datasets.

TABLE II
COMPARISON OF PD VALUES (IN PERCENTAGE) ACROSS ALL SUPERVISED LEARNING TECHNIQUES, FOR DIFFERENT VALUES OF β IN DATASET 2, WITH PF = 0.1, SNR = −16 dB.

Models                 β = 0.75   β = 1    β = 2
SVM                    99.27      83.26    67.89
KNN                    99.27      83.06    68.0
Logistic Regression    99.27      83.26    67.89
Random Forest          99.95      83.63    69.19

VI. CONCLUSION

In this paper, we proposed the use of a differential entropy vector as a feature to train different supervised machine learning algorithms for spectrum sensing in cognitive radios. The noise process was assumed to follow a generalized Gaussian distribution with a shape parameter β. The algorithms were run on two experimentally captured datasets, and our experiments show that the proposed differential entropy feature performs better than the energy feature vector.

REFERENCES

[1] T. Yucek and H. Arslan, “A survey of spectrum sensing algorithms for cognitive radio applications,” IEEE Commun. Surveys Tuts., vol. 11, no. 1, pp. 116–130, Mar. 2009.
[2] S. J. Zahabi, A. A. Tadaion, and S. Aissa, “Neyman-Pearson cooperative spectrum sensing for cognitive radio networks with fine quantization at local sensors,” IEEE Trans. Commun., vol. 60, no. 6, pp. 1511–1522, Jun. 2012.
[3] M. D. Lin, L. Fansheng, and S. Jie, “Simulation on multi-path fading in wireless channel,” in Proc. International Conference on Computer Science and Electronics Engineering, vol. 3, Mar. 2012, pp. 427–429.
[4] D. Bhargavi and C. R. Murthy, “Performance comparison of energy, matched-filter and cyclostationarity-based spectrum sensing,” in Proc. International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Jun. 2010, pp. 1–5.
[5] A. Satheesh, Aswini S. H., Lekshmi S. G., S. Sagar, and Hareesh Kumar M, “Spectrum sensing techniques a comparison between energy detector and cyclostationarity detector,” in Proc. International Conference on Control Communication and Computing (ICCC), Dec. 2013, pp. 388–393.
[6] X. Zhang, R. Chai, and F. Gao, “Matched filter based spectrum sensing and power level detection for cognitive radio network,” in Proc. IEEE Global Conference on Signal and Information Processing (GlobalSIP), Dec. 2014, pp. 1267–1270.
[7] S. J. Zahabi and A. A. Tadaion, “Local spectrum sensing in non-Gaussian noise,” in Proc. International Conference on Telecommunications, Apr. 2010, pp. 843–847.
[8] Y. Chen and N. C. Beaulieu, “Novel low-complexity estimators for the shape parameter of the generalized Gaussian distribution,” IEEE Trans. Veh. Technol., vol. 58, no. 4, pp. 2067–2071, May 2009.
[9] Q. Z. Ahmed, K. Park, and M. Alouini, “Ultrawide bandwidth receiver based on a multivariate generalized Gaussian distribution,” IEEE Trans. Wireless Commun., vol. 14, no. 4, pp. 1800–1810, Apr. 2015.
[10] J. Tian, P. Cheng, Z. Chen, M. Li, H. Hu, Y. Li, and B. Vucetic, “A machine learning-enabled spectrum sensing method for OFDM systems,” IEEE Trans. Veh. Technol., vol. 68, no. 11, pp. 11 374–11 378, Nov. 2019.
[11] Y. Arjoune and N. Kaabouch, “On spectrum sensing, a machine learning method for cognitive radio systems,” in Proc. IEEE International Conference on Electro Information Technology (EIT), May 2019, pp. 333–338.
[12] H. Qi, X. Zhang, and Y. Gao, “Channel energy statistics learning in compressive spectrum sensing,” IEEE Trans. Wireless Commun., vol. 17, no. 12, pp. 7910–7921, Dec. 2018.
[13] K. M. Thilina, K. W. Choi, N. Saquib, and E. Hossain, “Machine learning techniques for cooperative spectrum sensing in cognitive radio networks,” IEEE J. Sel. Areas Commun., vol. 31, no. 11, pp. 2209–2221, Nov. 2013.
[14] S. Gurugopinath, R. Muralishankar, and H. N. Shankar, “Differential entropy-driven spectrum sensing under generalized gaussian noise,” IEEE Commun. Lett., vol. 20, no. 7, pp. 1321–1324, Jul. 2016.
[15] Y. Lu, P. Zhu, D. Wang, and M. Fattouche, “Machine learning techniques with probability vector for cooperative spectrum sensing in cognitive radio networks,” in Proc. IEEE Wireless Communications and Networking Conference (WCNC), Apr. 2016, pp. 1–6.
[16] M. R. Vyas, D. K. Patel, and M. Lopez-Benitez, “Artificial neural network based hybrid spectrum sensing scheme for cognitive radio,” in Proc. International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), Oct. 2017, pp. 1–7.
[17] J. Zhang and Y. Wu, “Likelihood-ratio tests for normality,” Computational Statistics & Data Analysis, vol. 49, no. 3, pp. 709–721, Jun. 2005.
[18] S. Gurugopinath and R. Muralishankar, “Geometric power detector for spectrum sensing under symmetric alpha stable noise,” Electron. Lett., vol. 54, pp. 1284–1286, Nov. 2018.
[19] B. M. Pati. Raw spectrum data. [Online]. Available: https://github.com/bipungithub/spectrum_data
[20] M. A. Hearst, S. T. Dumais, E. Osuna, J. Platt, and B. Scholkopf, “Support vector machines,” IEEE Trans. Intell. Transp. Syst., vol. 13, no. 4, pp. 18–28, Jul. 1998.
[21] S. A. Dudani, “The distance-weighted k-nearest-neighbor rule,” IEEE Trans. Syst., Man, Cybern., vol. SMC-6, no. 4, pp. 325–327, Apr. 1976.
[22] T. N. Ho, “The random subspace method for constructing decision forests,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 8, pp. 832–844, Aug. 1998.