Future Generation Computer Systems: Anil Carie Mingchu Li Chang Liu Prakasha Reddy Waseef Jamal
Article info

Article history:
Received 8 August 2017
Received in revised form 20 October 2017
Accepted 7 November 2017
Available online xxxx

Keywords:
Software defined cognitive radio network
MAC protocol
Directional antennas
Software defined radio
Q-learning
Power control

Abstract

In this paper, we investigate the Hybrid Directional CR-MAC based on Q-Learning with Directional Power Control in cognitive radio (CR) systems. In CR systems, nodes can opportunistically switch to heterogeneous non-overlapping channels, which offer higher achievable throughput. However, the random channel selection policy in existing CR-MAC protocols suffers from problems such as delay, packet collisions, and degraded quality of service. The proposed channel selection scheme, which is quite different from the traditional scheme, is adopted by nodes to achieve context awareness and intelligence for adaptive channel selection. The nodes select a channel based on the results learned from interactions with the other nodes and channels. The directional transmission power control scheme allows the nodes to reuse the channels subject to interference constraints. The simulation results show that nodes using the proposed algorithm can adaptively select channels and an optimal transmission power, which helps to achieve high throughput and minimized power consumption.
© 2017 Elsevier B.V. All rights reserved.
https://doi.org/10.1016/j.future.2017.11.014
0167-739X/© 2017 Elsevier B.V. All rights reserved.
Please cite this article in press as: A. Carie, et al., Hybrid Directional CR-MAC based on Q-Learning with Directional Power Control, Future Generation Computer Systems
(2017), https://doi.org/10.1016/j.future.2017.11.014.
channel impairments. However, it is not suitable for the primary user's QoS if secondary users transmit with arbitrarily high power. Thus it is natural that power control should rely on the interference levels. In this paper, we propose "Hybrid Directional CR-MAC based on Q-Learning with Directional Power Control", where a learning algorithm is used for channel selection and a new directional transmission power control (DTPC) scheme is used to enhance the throughput and energy efficiency of the directional hybrid CR-MAC protocol. The main contributions of our work are twofold. First, using the channel selection algorithm, we try to select the best channel based on the SU's observation of the PU's traffic and of channel characteristics such as throughput achieved and packets lost. Second, we investigate the problem of channel reuse in directional communication, where the CR user has power control capability. That is, a CR user can transmit at any power in the allowable power range so as to achieve maximum concurrent transmissions. The rest of the paper is organized as follows. In Section 2, we present related work. In Section 3, we present the system model. An overview of the Q-learning algorithm is presented in Section 4. The proposed Hybrid Directional CR-MAC based on Q-Learning with Directional Power Control scheme is presented in Section 5. Simulation results for different network topologies are shown in Section 6 to establish the substantial throughput and energy gains that can be attained under the investigated scheme. Finally, we present our conclusions and future work.

2. Related work

Game theory based CR
In CR networks, game theory (GT) has recently been the most popular method for attaining context awareness and intelligence. In GT, SUs interact to maximize their individual objectives, such as delay or throughput. However, game theory has several limitations, which are addressed by the reinforcement learning (RL) approach. Firstly, GT-based CR requires a complete set of information to compute the Nash equilibrium; hence it is more suitable for centralized CR networks [15,16].
Secondly, GT assumes a single type of objective function throughout the CR network, and hence a homogeneous learning mechanism in all the SUs. Thirdly, SUs might converge to a suboptimal action due to miscoordination even when an optimal action exists. Although GT has been successfully applied in CR networks [17–26], the RL approach is a good alternative that addresses the above issues associated with GT. For instance, RL supports a heterogeneous learning mechanism in each agent because each agent can represent distinctive performance metrics as local rewards.

Omni directional power control
SUs vary their transmit power depending on the interference at the primary receiver and the maximum secondary transmit power constraint [27]. The concurrent transmission region is maximized in [6] using optimal power control. The number of concurrent transmissions is maximized in [7] using dynamic spectrum sharing. Power control with the objectives of maximizing sum-rate, achieving rate fairness, and minimizing power consumption is studied in [28–32]. The necessary and sufficient condition for the feasible region of a power-controlled MAC consisting of only two transmission links is derived in [33]. A channel hopping sequence is used to allocate the control channel to one-hop neighbor nodes [25]. The basic drawback of sequential CCC-based CR-MAC is longer channel rendezvous delays [26–31]. Channel rendezvous becomes more challenging as the availability of PU channels increases.

Directional power control
For reusing spectrum in the macro cell, the underlying microcell uses antenna beamforming and power allocation schemes to maximize the multiuser sum rate [34]. The capacity and power consumption of wireless networks using directional antennas are studied in [35]. In [36], the achievable throughput of mobile ad hoc networks with directional antennas is addressed. Directional Medium Access Control is studied in [37]; it suffers from deafness and hidden terminal problems. Power control with a directional antenna over a packet radio network was first considered in [38]. Power control built on D-MAC with directional RTS, omnidirectional CTS, and optimal power for the data packet is studied in [39]; it increased network capacity and reduced power consumption. An optimization problem for selecting the range of channels for transmission with the control channel and aggregated data channel was employed in statistical channel allocation MAC (SCA-MAC) [40], which outperforms the random scheme. DSA-MAC [41], DCRMAC [42], HC-MAC [43], DDMAC [44], and SMA [45] are protocols that are similar to the IEEE 802.11 DCF standard for reserving a channel.

3. System model

In this study, users with CR capabilities, referred to as secondary users, can communicate with other CR nodes utilizing the primary network's available spectrum spatially and/or temporally. When the secondary network does not have enough resources, CR nodes form an ad hoc network without a central controller or dedicated control channels. Due to the highly dynamic and heterogeneous networking environment, a dedicated control channel is not pre-defined for exchanging control messages. We assume that the nodes are close enough to consider an interference-limited spectrum sharing scenario in which a CRAHN operates. The system model is composed of M licensed channels, which are accessed opportunistically by K CR nodes (each acting as both transmitter and receiver), and D primary users. The primary transmitters, primary receivers, and the mobile CR devices are distributed randomly within the coverage area. Similar to [46–48], a two-state continuous-time Markov process is used to model the traffic of each channel: the channel is either occupied by the PU (busy state) or not occupied by the PU (idle state). These two states are referred to as ON and OFF, respectively. Channel availability depends not only on each SU transmitter and its corresponding SU receiver, but also on the time-varying activities of the PUs. We consider the situation in which several SUs may compete for the same channel, and one SU may have more than one channel for selection.

Antenna model
To predict the received power, as in [49,50], we consider a general power propagation model

$P_r = P_t \dfrac{C\, G_{rt}\, G_{tr}}{d_{tr}^{\alpha}}$

where $P_t$ is the transmit power, $d_{tr}$ is the distance between transmitter and receiver, $\alpha$ is the path-loss exponent, $G_{tr}$ and $G_{rt}$ are the directive gains of the transmitting and receiving antennas toward the direction of the receiving and transmitting antennas, respectively, and $C$ is a constant determined by other factors such as antenna heights and wavelength.

Time slot structure
The system model has a slotted transmission structure, as shown in Fig. 3 and described as follows. Each secondary user executes the following stages synchronously during each time slot.

- CHANNEL SENSING: SUs sense the PU channels to detect the activity of PUs.
- PCL-EXCHANGE: After sensing, SUs broadcast their Primary user free Channel List (PCL) to their neighbors. After receiving PCL information from neighbors, SUs update their PCL table.
- CHANNEL RESERVATION: This stage is divided into N slots, and every slot is divided into two sub-slots: sub-slot S1 for a node to send RTS directionally and sub-slot S2 for the destination node to reply with CTS or DNAV.
- DATA TRANSFER: SUs which successfully reserved a channel in the CHANNEL RESERVATION phase start data transmission.
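As a rough illustration of the propagation model in the antenna model subsection, the following Python sketch evaluates P_r = P_t C G_rt G_tr / d_tr^α in linear units. The function names, the antenna gains of 4, the path-loss exponent of 4, and C = 1 are assumptions chosen only for this example and are not taken from the paper; the 10 mW transmit power and 200 m distance simply echo values listed in Table 1.

```python
import math


def received_power(p_t, g_tr, g_rt, d_tr, alpha, c=1.0):
    """Received power P_r = P_t * C * G_tr * G_rt / d_tr**alpha.

    All quantities are in linear units (watts, unitless gains, metres).
    c stands for the constant C of the model; c = 1.0 is an assumption.
    """
    if d_tr <= 0:
        raise ValueError("distance must be positive")
    return p_t * c * g_tr * g_rt / d_tr ** alpha


def to_dbm(p_watts):
    """Convert a power in watts to dBm."""
    return 10.0 * math.log10(p_watts * 1000.0)


if __name__ == "__main__":
    # Hypothetical link: 10 mW transmit power, gain 4 at both ends,
    # 200 m separation, path-loss exponent 4 (all gains/exponent assumed).
    p_r = received_power(p_t=0.01, g_tr=4.0, g_rt=4.0, d_tr=200.0, alpha=4.0)
    print(f"P_r = {p_r:.3e} W ({to_dbm(p_r):.1f} dBm)")
```

Under these assumed values the received power comes out at roughly -70 dBm, which could then be compared against a receiver threshold such as the one in Table 1.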
4. Overview of Q-learning algorithm

We use Q-learning [12], a form of reinforcement learning that does not need a model of its environment and works by estimating the values of state–action pairs. The estimate of the future reward in the Q-learning algorithm is given by Q(S, A) when an agent takes a particular action A in a particular state S. Learning intervals are denoted by $t \in T = \{1, 2, \ldots\}$, a constant interval by $t_D$, actions by $a \in A$, and rewards by $r_{t+1}(a_t)$. Every agent records the values learnt from the environment in a Q-table for all its possible actions, with $|A|$ entries. The local reward for an action is reflected in its Q-value; hence agents change their actions when there is a change in Q-value. At each interval t, agent i chooses an action $a_t$ and receives local reward $r_{t+1}(a_t)$ at time t+1. Agent i updates the Q-value of $a_t$ at time t+1 as follows:

$Q^i_{t+1}(a^i_t) \leftarrow (1 - \alpha)\, Q^i_t(a^i_t) + \alpha\, r^i_{t+1}(a^i_t)$ (1)

where $0 \le \alpha \le 1$ is the learning rate. The value of α decides the dependence on the reward; a higher value of α gives more importance to the local reward than to past knowledge.

Agents search for the action that maximizes the value function $V^{\pi}$, as shown below:

$V^{\pi} = \max_{a \in A} Q^i_t(a)$. (2)

By exploring the environment, the agents build a table of Q-values for each environment state and each possible action. Exploitation chooses the best known action, or the greedy action, at all times for performance enhancement. Exploration chooses the other non-optimal actions once in a while to improve the estimates of all the Q-values in order to discover better actions. The learning rate and the discount factor are important parameters of the Q-learning algorithm. The learning rate parameter limits how quickly learning can occur. The discount factor controls the value placed on future rewards.

5. Hybrid directional CR-MAC based on Q-learning with directional power control

5.1. Channel selection using Q-learning

We present our directional antenna based hybrid CCC based CR-MAC with dynamic channel selection implementation in this section. Each SU node is equipped with one transceiver, which is used for both control and data. The 902 MHz channel is used by nodes to exchange their free channel lists, which are used to find a common control channel (CCC) among nodes. Nodes decide on a PU-free (available) data channel in the licensed band over the CCC. Two-way handshaking is performed by nodes to transmit control and data information. An illustration of the cognitive MAC protocol is shown in Fig. 4. The channel switching decision is made at the beginning of each time slot and depends on the channel state information.
In Directional Hybrid (DH) CR-MAC, the dynamic channel selection (DCS) applies Q-learning (QL) to select a channel. Each agent divides time into fixed intervals of 't' and keeps track of the number of packets transmitted successfully, 'ND'. At the beginning of each 't', every node updates its Q-value using (1), chooses the channel with the highest Q-value, and broadcasts it to its neighbors along with the PCL. After receiving the broadcast information, nodes update their PCL table and calculate the population state of each channel, i.e., the number of users selecting a given channel. Nodes select the channel with the least population state for the transmission. Every node maintains a Q-table that consists of Q-values in the range of 0 to 1. We use a dynamic Q-table, in which the size of the Q-table of a node is determined by the number of available channels. The Q-table and the learning task are distributed among the different nodes (states). Therefore, when choosing the next channel, we let the agent act greedily, taking, in each situation, the action with the highest Q-value. If the node can transmit data successfully using the channel, then the reward R is increased by the number of packets transmitted; otherwise R is 0. The discount factor is an important parameter of the Q-learning algorithm. We use a variable discount factor, which is determined by the PU activity rate, the number of SUs competing for the channel, and the bandwidth.

CHANNEL SELECTION USING DH-CRMAC

Set α ∈ [0, 1] // learning rate
For initial time slot
    Select random channel
    Broadcast PCL with chosen channel IDs at the top of the list to other users.
    Receive the information of other users' channel selection.
    Calculate population state of each channel (count number of users selecting a given channel).
    Choose the channel with least count.
    Sense and contend for the chosen channel and transmit data packets if successfully grabbing the channel.
    If (receive ACK for the DATA packet sent) Then
        ND = ND + 1
    End if
End of initial time slot.

For remaining time slots
    r_{t+1}(a_t) = (ND/TD) + population state
    Q_{t+1}(a_t) = (1 − α) Q_t(a_t) + α r_{t+1}(a_t)
    R = uniform(0, 1) {generate random number}
    If R <= ε then
        a_temp = uniform(1, k)
    Else
        a_temp = argmax_{a ∈ A} Q(a)
        If |Q_{t+1}(a_temp) − Q_{t+1}(a_t)| <= β then
            a_{t+1} = a_t
        Else
            a_{t+1} = a_temp
        End if
    End if
    Return a_{t+1}
    Broadcast PCL with chosen channel IDs at the top of the list to other users.
    Receive the information of other users' channel selection.
    Calculate population state of each channel (count number of users selecting a given channel).
    Choose the channel with least count.
    Sense and contend for the chosen channel and transmit data packets if successfully grabbing the channel.
End of time slot

5.2. Optimal directional power control

In this section, the proposed optimal directional power control algorithm for channel spatial reuse is presented. We first study the feasibility of channel reuse with the proposed optimal power control algorithm. Fig. 2 illustrates a classical spectrum access or spectrum sharing scenario with D randomly distributed primary users (PU in Fig. 2) and K secondary users (SU in Fig. 1). In this scenario, we assume that each of the PUs is equipped with an omnidirectional antenna, while each of the SUs is equipped with multiple antennas, which are available for beamforming. In this case, the small cell consists of PU broadcasting channels and SU beamforming channels. In addition, considering the mobility of both the primary users and secondary users, we assume that both PUs and SUs follow the homogeneous Poisson point process (HPPP). Let {N(A)} denote the number of users in the area "A", such as the cell in Fig. 2. If {N(A)} follows an HPPP with intensity λ > 0, that is, N(A) ∼ Poisson(λ|A|), then the probability of N(A) = k can be expressed as:

$P(N(A) = k) = \dfrac{e^{-\lambda |A|}\, (\lambda |A|)^{k}}{k!}$. (3)
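To make Eq. (3) concrete, the short Python sketch below evaluates the probability of finding exactly k users in a region of area |A| under an HPPP with intensity λ. The function name and the numbers in the usage example (intensity and area) are hypothetical and only serve as an illustration; they are not values from the paper.

```python
import math


def hppp_count_pmf(k, intensity, area):
    """P(N(A) = k) for an HPPP: exp(-lam*|A|) * (lam*|A|)**k / k!.

    intensity is lambda (users per unit area), area is |A|.
    The log-domain form avoids overflow for large k.
    """
    if k < 0 or intensity <= 0 or area <= 0:
        raise ValueError("need k >= 0, intensity > 0 and area > 0")
    mean = intensity * area
    log_p = -mean + k * math.log(mean) - math.lgamma(k + 1)
    return math.exp(log_p)


if __name__ == "__main__":
    # Hypothetical example: on average 2 users expected in the region.
    lam, area = 0.5, 4.0
    for k in range(5):
        print(k, round(hppp_count_pmf(k, lam, area), 4))
```

Summing the returned values over all k gives 1, and the expected number of users in the region equals λ|A|, which is a quick sanity check on any chosen intensity.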
Fig. 4. Node communication using hybrid control channel using learning and power
control.
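The learning-based channel selection of Section 5.1 (the Q-value update of Eq. (1) followed by an ε-greedy choice with a switching threshold β) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the reward is passed in by the caller rather than computed from ND, TD and the population state, and in the exploration branch the randomly drawn channel is returned directly, which the pseudocode leaves unspecified.

```python
import random


def update_q(q, action, reward, alpha):
    """Eq. (1): Q(a) <- (1 - alpha) * Q(a) + alpha * r, updated in place."""
    q[action] = (1.0 - alpha) * q[action] + alpha * reward
    return q[action]


def select_channel(q, current, epsilon, beta, rng=random):
    """Pick the next channel index from the Q-table q (one entry per channel).

    With probability epsilon a random channel is explored (assumption:
    it is used as drawn). Otherwise the greedy channel is taken, unless
    its Q-value is within beta of the current channel's Q-value, in
    which case the node stays on the current channel.
    """
    if rng.random() <= epsilon:
        return rng.randrange(len(q))                  # exploration
    best = max(range(len(q)), key=lambda a: q[a])     # exploitation
    if abs(q[best] - q[current]) <= beta:
        return current                                # gain too small to switch
    return best


if __name__ == "__main__":
    q_table = [0.0, 0.0, 0.0]      # three available channels (hypothetical)
    channel = 0
    update_q(q_table, channel, reward=0.2, alpha=0.5)
    channel = select_channel(q_table, channel, epsilon=0.1, beta=0.05)
    print(q_table, channel)
```

The threshold β acts as hysteresis: a node only pays the cost of switching channels when the learned advantage of the best channel exceeds β.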
where n is a zero-mean independent complex Gaussian noise with unit variance; $h \in \mathbb{C}^{M \times 1}$ is the channel vector between the

For i = 1, 2, . . . , m, where γs is the channel gain of the SU link. Therefore, by solving the above optimization, we can finally obtain
Table 1
Simulation parameters.

Parameter name                        Description
Topology                              1000 × 1000 flat grid
No. of CR nodes                       100
No. of PU channels                    10 channels (8 MHz each)
No. of PU transmitters                10
Unlicensed channel                    ISM 902 MHz
PU active probability                 10, 15 and 20 ms
Mobility model                        Random waypoint
Input CR transmit power               10 mW
Receiver threshold                    −95 dBm
Carrier sense threshold               −115 dBm
CR Tx range                           200 m (licensed channel)
PU Tx range                           500 m (licensed channel)
Data rate                             2 Mbps
Antenna type (channel reservation)    Directional (4-element)
Antenna type (data transmission)      Directional (4-element)
Beamwidth                             90°
Interface queue length                50
Simulation time                       100 s
Traffic type                          CBR/UDP
Packet size                           512 and 1024 bytes
6. Simulation results
Fig. 7. (a) Aggregate throughput with respect to data rate (PU channels: 5). (b) Average packet delay with respect to data rate (PU channels: 5). (c) Aggregate network throughput with respect to data rate (PU channels: 10). (d) Average packet delay with respect to MAC data rate (PU channels: 10).

Acknowledgment

References

[1] J. Mitola, Cognitive radio: An integrated agent architecture for software defined radio (Ph.D. dissertation), KTH Royal Institute of Technology, 2000.
[2] S. Srinivasa, S.A. Jafar, Soft sensing and optimal power control for cognitive radio, IEEE Trans. Wireless Commun. 9 (12) (2010) 3638–3649.
[3] V. Asghari, S. Aissa, Adaptive rate and power transmission in spectrum-sharing systems, IEEE Trans. Wireless Commun. 9 (10) (2010) 3272–3280.
[4] E.C.Y. Peh, Y.-C. Liang, Y. Zeng, Sensing and power control in cognitive radio with location information, in ICCS, 2012, pp. 255–259.
[5] L.-C. Wang, A. Chen, Effects of location awareness on concurrent transmissions for cognitive ad hoc networks overlaying infrastructure-based systems, IEEE Trans. Mobile Comput. 8 (5) (2009) 577–589.
[6] Y. Song, J. Xie, Optimal power control for concurrent transmissions of location-aware mobile cognitive radio ad hoc networks, in GLOBECOM, no. July 2009, pp. 1–6.
[7] M.R. Hassan, G. Karmakar, J. Kamruzzaman, Maximizing the concurrent transmissions in cognitive radio ad hoc networks, in IWCMC, no. July 2011, pp. 466–471.
[8] S.M. Sánchez, R.D. Souza, E.M.G. Fernandez, V.A. Reguera, W. Godoy, Effect of location accuracy and shadowing on the probability of non-interfering concurrent transmissions in cognitive ad hoc networks, Radioengineering 22 (4) (2013) 1138–1149.
[9] S.M. Sánchez, R.D. Souza, E.M.G. Fernandez, V.A. Reguera, Impact of rate control on the performance of a cognitive radio ad hoc network, IEEE Commun. Lett. 16 (9) (2012) 1424–1427.
[10] S.M. Sánchez, R.D. Souza, E.M.G. Fernandez, V.A. Reguera, Rate and energy efficient power control in a cognitive radio ad hoc network, IEEE Signal Process. Lett. 20 (2013) 451–454.
[11] S. Buzzi, D. Saturnino, A game-theoretic approach to energy-efficient power control and receiver design in cognitive CDMA wireless networks, IEEE J. Sel. Topics Signal Process. 5 (1) (2011) 137–150.
[12] M. Ku, L. Wang, Y.T. Su, Toward optimal multiuser antenna beamforming for hierarchical cognitive radio systems, IEEE Trans. Commun. 60 (10) (2012) 2872–2885.
[13] Proceedings of the first IEEE symposium on New Frontiers in Dynamic Spectrum Access Networks, November 2005.
[14] M. Chiang, P. Hande, T. Lan, C.W. Tan, Power control in wireless cellular networks, Found. Trends Netw. 2 (4) (2008) 381–533.
[15] Z. Ji, K.J.R. Liu, Dynamic spectrum sharing: A game theoretical overview, IEEE Comm. Mg. 45 (5) (2007) 88–94.
[16] D. Niyato, E. Hossain, Competitive spectrum sharing in cognitive radio networks: A dynamic game approach, IEEE T. Wls. Comm. 7 (7) (2008) 2651–2660.
[17] N. Nie, C. Comaniciu, Adaptive channel allocation spectrum etiquette for cognitive radio networks, Symp. on New Frntr. in Dynmc. Spctrm. Acs. Nwk. (DySPAN), IEEE, Baltimore, MD, 2005, pp. 269–278.
[18] M.R. Musku, A.T. Chronopoulos, S. Penmatsa, D.C. Popescu, A game theoretic approach for medium access of open spectrum in cognitive radios, 2nd Intl. Conf. on Cgntve. Rd. Orntd. Wls. Nwk. and Comm. (CROWNCOM), IEEE, Orlando, FL, July 2007, pp. 336–341.
[19] Z. Han, C. Pandana, K.J.R. Liu, Distributive opportunistic spectrum access for cognitive radio using correlated equilibrium and no-regret learning, Wls. Comm. Nwk. Conf. (WCNC), IEEE, Hong Kong, March 2007, pp. 11–15.
[20] Z. Ji, K.J.R. Liu, Dynamic spectrum sharing: A game theoretical overview, IEEE Comm. Mg. 45 (5) (2007) 88–94.
[21] S. Subramani, T. Basar, S. Armour, D. Kaleshi, Z. Fan, Noncooperative equilibrium solutions for spectrum access in distributed cognitive radio networks, Symp. on New Frntr. in Dynmc. Spctrm. Acs. Nwk. (DySPAN), IEEE, Chicago, IL, October 2008.
[22] H.N. Pham, J. Xiang, Y. Zhang, T. Skeie, QoS-aware channel selection in cognitive radio networks: A game-theoretic approach, Glbl. Telecomm. Conf. (GLOBECOM), IEEE, New Orleans, LA, December 2008.
[23] H. Qin, H. Wang, H. Zhou, A selfish game-theoretic approach for cognitive radio networks with dynamic spectrum sharing, Intl. Conf. on Comp. Sc. Sftwr. Eng. (CSSE), China, December 2008, pp. 1105–1109.
[24] X. Gong, W. Yuan, W. Liu, W. Cheng, S. Wang, A cooperative relay scheme for secondary communication in cognitive radio networks, Glbl. Telecomm. Conf. (GLOBECOM), IEEE, New Orleans, LA, December 2008.
[25] V. Maskery, V. Krishnamurthy, Q. Zhao, Decentralized dynamic spectrum access for cognitive radios: Cooperative design of a noncooperative game, IEEE T. Comm. 57 (2) (2009) 459–469.
[26] H.-P. Shiang, M.V.D. Schaar, Queuing-based dynamic channel selection for heterogeneous multimedia applications over cognitive radio networks, IEEE T. Mm 10 (5) (2008) 896–909.
[27] S. Srinivasa, S.A. Jafar, Soft sensing and optimal power control for cognitive radio, IEEE Trans. Wireless Commun. 9 (12) (2010) 3638–3649.
[28] A. Hoang, Y. Liang, M. Islam, Power control and channel allocation in cognitive radio networks with primary users' cooperation, IEEE Trans. Mob. Comput. 9 (3) (2010) 348–360.
[29] B. Maham, P. Popovski, X. Zhou, A. Hjorungnes, Cognitive multiple access networks with outage margin in the primary system, IEEE Trans. Wirel. Commun. 10 (10) (2011) 3343–3353.
[30] E. Dall'Anese, S.-J. Kim, G.B. Giannakis, S. Pupolin, Power control for cognitive radio networks under channel uncertainty, IEEE Trans. Wirel. Commun. 10 (10) (2011) 3541–3551.
[31] X. Gong, S. Vorobyov, C. Tellambura, Optimal bandwidth and power allocation for sum ergodic capacity under fading channels in cognitive radio networks, IEEE Trans. Signal Process. 59 (4) (2011) 1814–1826.
[32] C.C. Zarakovitis, N. Qiang, D.E. Skordoulis, M.G. Hadjinicolaou, Power-efficient cross-layer design for OFDMA systems with heterogeneous QoS, imperfect CSI, and outage considerations, IEEE Trans. Veh. Technol. 61 (2) (2012) 781–798.
[33] A. Behzad, Z. Rubin, Multiple access protocol for power-controlled wireless access nets, IEEE Trans. Mob. Comput. 3 (4) (2004) 307–316.
[34] M. Ku, L. Wang, Y.T. Su, Toward optimal multiuser antenna beamforming for hierarchical cognitive radio systems, IEEE Trans. Commun. 60 (10) (2012) 2872–2885.
[35] P. Li, C. Zhang, Y. Fang, The capacity of wireless ad hoc networks using directional antennas, IEEE Trans. Mobile Comput. 10 (10) (2011) 1374–1387.
[36] Y. Chen, J. Liu, X. Jiang, O. Takahashi, Throughput analysis in mobile ad hoc networks with directional antennas, Ad Hoc Networks 11 (3) (2013) 1122–1135.
[37] R. Choudhury, N.H. Vaidya, Impact of directional antennas on ad hoc routing, in Conference on Personal and Wireless Communication, September 2003.
[38] J. Zander, Slotted Aloha multihop packet radio networks with directional antennas, Electron. Lett. 26 (25) (1990).
[39] Y.B. Ko, V. Shankarkumar, N.H. Vaidya, Medium access control protocols using directional antennas in ad hoc networks, in Annual Joint Conference of the IEEE Computer and Communications Societies, March 2000.
[40] A. Chia-Chun Hsu, David S.L. Wei, C.-C. Jay Kuo, A cognitive radio MAC protocol using statistical channel allocation for wireless ad hoc networks, in Proc. IEEE WCNC, March 2007, pp. 105–110.
[41] S.L. Wu, C.Y. Lin, Y.C. Tseng, J.P. Sheu, A new multi-channel MAC protocol with on-demand channel assignment for multi-hop mobile ad hoc networks, IEEE DySPAN, Maryland, USA, 2005, pp. 203–213.
[42] S.J. Yoo, H. Nan, T.I. Hyon, DCR-MAC: Distributed cognitive radio MAC protocol for wireless ad hoc networks, Wirel. Commun. Mobile Comput. 9 (5) (2009) 631–653.
[43] J. Jia, Q. Zhang, X. Shen, HC-MAC: A hardware-constrained cognitive MAC for efficient spectrum management, IEEE J. Sel. Areas Commun. 26 (1) (2008) 106–117.
[44] H.A.B. Salameh, M.M. Krunz, O. Younis, Cooperative adaptive spectrum sharing in cognitive radio networks, IEEE/ACM Trans. Netw. 18 (4) (2010) 1181–1194.
[45] X. Wang, A. Wong, P.H. Ho, Stochastic medium access for cognitive radio ad hoc networks, IEEE J. Sel. Areas Commun. 29 (4) (2011) 770–783.
[46] Y. Yilmaz, Z. Guo, X. Wang, Sequential joint spectrum sensing and channel estimation for dynamic spectrum access, IEEE J. Sel. Areas Commun. 32 (11) (2014) 2000–2012.
[47] N. Khambekar, C.M. Spooner, V. Chaudhary, On improving serviceability with quantified dynamic spectrum access, in Proceedings of the IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN '14), pp. 553–564, McLean, Va, USA, April 2014.
[48] T.M.C. Chu, H. Phan, H.J. Zepernick, Hybrid interweave underlay spectrum access for cognitive cooperative radio networks, IEEE Trans. Commun. 62 (7) (2014) 2183–2197.
[49] P. Li, C. Zhang, Y. Fang, The capacity of wireless ad hoc networks using directional antennas, IEEE Trans. Mobile Comput. 10 (10) (2011) 1374–1387.
[50] Y. Chen, J. Liu, X. Jiang, O. Takahashi, Throughput analysis in mobile ad hoc networks with directional antennas, Ad Hoc Networks 11 (3) (2013) 1122–1135.

Anil Carie received the B.Tech. degree in Computer Science and Engineering from Jawaharlal Nehru Technological University, Hyderabad, India, and the M.Tech. degree in Software Engineering from Karunya University, Coimbatore, India. He worked as faculty in Vidya Barathi Institute of Technology, Warangal, India, from June 2010 to June 2014. He is currently a Ph.D. candidate in the School of Software Technology of Dalian University of Technology, China. His research interests include common control channel design for MAC and routing protocols in cognitive radio ad hoc networks.
Mingchu Li received the B.S. degree in mathematics from Jiangxi Normal University and the M.S. degree in applied science from University of Science and Technology Beijing in 1983 and 1989, respectively. He worked for University of Science and Technology Beijing as an associate professor from 1989 to 1994. He received his doctorate in Mathematics from University of Toronto in 1998. He worked for the School of Software of Tianjin University as a full professor from 2002 to 2004 and, from 2004 to now, for the School of Software Technology of Dalian University of Technology as a full Professor and Vice Dean. His main research interests include theoretical computer science and information security, and trust models and cooperative game theory.

CH.R. Prakasha Reddy is currently working as a Lecturer at the College of Engineering and Technology in the Department of Informatics, Wollega University T.R, Ethiopia. He received his Bachelor degree in Information Technology (2007) from the Department of Information Technology, St.Theressa Institute of Engineering and Technology, Jawaharlal Nehru Technological University, Andhra Pradesh, India. He joined Karunya University, Coimbatore, India, for a Masters in Network and Internet Engineering (2010). He worked as Assistant Professor at the Department of Information Technology, SRK Institute of Technology, Vijayawada, Andhra Pradesh, India (2010–2013). His main research interests include storage area networks, network security, energy-efficient routing in wireless networks, and artificial intelligence.