Collaborative Anti-jamming in Cognitive Radio Networks Using Minimax-Q Learning



Sangeeta Singh 1,*, Aditya Trivedi 2, Navneet Garg 3

1. Department of Electronics & Communication Engineering, PDPM-IIITDM, Jabalpur, 482005, India

2. Department of Digital Communication, ABV-IIITM, Gwalior, 474010, India

3. Department of Electrical Engineering, IIT Kanpur, 208016, India

* Corresponding author.


Received: 7 Jun. 2013 / Revised: 5 Jul. 2013 / Accepted: 2 Aug. 2013 / Published: 8 Sep. 2013

Index Terms

Cognitive radio networks, Stochastic game theory, Collaborative games, Markov decision process, Reinforcement learning


Abstract

Cognitive radio is an efficient technique for the realization of dynamic spectrum access. Since, in the cognitive radio network (CRN) environment, the secondary users (SUs) are susceptible to random jammers, the security of the SUs' channel access becomes crucial for the CRN framework. The rapidly varying spectrum dynamics of the CRN, together with the jammer's actions, lead to a challenging scenario. A stochastic zero-sum game and a Markov decision process (MDP) are generally used to model this scenario. To learn the channel dynamics and the jammer's strategy, the SUs use reinforcement learning (RL) algorithms such as Minimax-Q learning. In this paper, we propose multi-agent, multi-band collaborative anti-jamming among the SUs to combat a single jammer using the Minimax-Q learning algorithm. The SUs collaborate by sharing their policies or episodes. We show that sharing the learned policies or episodes enhances the SUs' probability of learning the jammer's strategies, but the reward is reduced as the cost of communication increases. Simulation results show an improvement in the learning probability of the SU when collaborative anti-jamming with Minimax-Q learning is used over the scenario of a single SU fighting the jammer.
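The Minimax-Q learning mentioned above treats each state as a two-player zero-sum matrix game between the SU and the jammer. The following Python sketch is purely illustrative: the toy channel-hopping game, the two-state space, and the exploratory policy are our own assumptions, and the state value is approximated by a maximin over pure strategies, whereas the full algorithm solves a linear program over mixed SU strategies at every update.

```python
import random

# Illustrative sketch of Minimax-Q learning for a two-player zero-sum
# Markov game (SU vs. jammer). The environment below is a toy
# assumption, not the paper's model.

STATES = range(2)        # e.g. channel idle / channel jammed
SU_ACTIONS = range(2)    # SU: stay on band / hop to other band
JAM_ACTIONS = range(2)   # jammer: stay / hop

GAMMA, ALPHA = 0.9, 0.1  # discount factor and learning rate

# Q(s, a, o): value of SU action a against jammer action o in state s
Q = {(s, a, o): 0.0 for s in STATES for a in SU_ACTIONS for o in JAM_ACTIONS}

def value(s):
    # Maximin over pure strategies; exact Minimax-Q instead solves a
    # linear program over mixed (randomized) SU strategies here.
    return max(min(Q[s, a, o] for o in JAM_ACTIONS) for a in SU_ACTIONS)

def update(s, a, o, r, s_next):
    # Standard temporal-difference update toward the minimax value
    Q[s, a, o] += ALPHA * (r + GAMMA * value(s_next) - Q[s, a, o])

def step(s, a, o):
    # Toy dynamics: the SU earns reward 1 when its band choice differs
    # from the jammer's, 0 when it is jammed; the next state is random.
    r = 1.0 if a != o else 0.0
    return r, random.randrange(len(STATES))

random.seed(0)
s = 0
for _ in range(5000):
    a = random.randrange(len(SU_ACTIONS))    # exploratory SU play
    o = random.randrange(len(JAM_ACTIONS))   # random jammer
    r, s_next = step(s, a, o)
    update(s, a, o, r, s_next)
    s = s_next
```

After training, entries where the SU evades the jammer (a ≠ o) carry a markedly higher Q-value than entries where it is caught, which is the signal a minimax policy exploits. Collaboration as studied in the paper would correspond to SUs exchanging these learned Q-tables (policies) or state-action-reward episodes.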

Cite This Paper

Sangeeta Singh, Aditya Trivedi, Navneet Garg, "Collaborative Anti-jamming in Cognitive Radio Networks Using Minimax-Q Learning", International Journal of Modern Education and Computer Science (IJMECS), vol.5, no.9, pp.11-18, 2013. DOI:10.5815/ijmecs.2013.09.02


References

[1] S. Haykin, “Cognitive radio: brain-empowered wireless communications,” IEEE Journal on Selected Areas in Communications, vol. 23, no. 2, pp. 201–220, 2005.
[2] I. Akyildiz, W. Lee, M. Vuran, and S. Mohanty, “NeXt generation/dynamic spectrum access/cognitive radio wireless networks: a survey,” Computer Networks, vol. 50, no. 13, pp. 2127–2159, 2006.
[3] M. Littman and C. Szepesvári, “A generalized reinforcement-learning model: Convergence and applications,” in Proceedings of the Thirteenth International Conference on Machine Learning (ICML), pp. 310–318, 1996.
[4] J. Mertens and A. Neyman, “Stochastic games,” International Journal of Game Theory, vol. 10, no. 2, pp. 53–66, 1981.
[5] A. Neyman and S. Sorin, Stochastic Games and Applications. Springer Netherlands, vol. 570, 2003.
[6] G. Rummery and M. Niranjan, On-line Q-learning Using Connectionist Systems. University of Cambridge, Department of Engineering, 1994.
[7] M. Littman, “Markov games as a framework for multi-agent reinforcement learning,” in Proceedings of the Eleventh International Conference on Machine Learning (ICML), pp. 157–163, 1994.
[8] J. Filar and K. Vrieze, Competitive Markov Decision Processes. Springer-Verlag, 1997.
[9] M. Wiering, “QV(λ)-learning: A new on-policy reinforcement learning algorithm,” in D. Leone, Ed., Proceedings of the 7th European Workshop on Reinforcement Learning, pp. 29–30, 2005.
[10] C. Claus and C. Boutilier, “The dynamics of reinforcement learning in cooperative multiagent systems,” in Proceedings of the National Conference on Artificial Intelligence (AAAI), pp. 746–752, 1998.
[11] R. Sutton and A. Barto, Reinforcement Learning: An Introduction. MIT Press, 1998.
[12] L. Matignon, G. Laurent, and N. Le Fort-Piat, “Independent reinforcement learners in cooperative Markov games: a survey regarding coordination problems,” The Knowledge Engineering Review, vol. 27, no. 1, pp. 1–31, 2012.
[13] K. Liu and B. Wang, Cognitive Radio Networking and Security: A Game-Theoretic View. Cambridge University Press, 2010.
[14] B. Wang, Y. Wu, and K. Liu, “Game theory for cognitive radio networks: An overview,” Computer Networks, vol. 54, no. 14, pp. 2537–2561, 2010.
[15] B. Wang, Y. Wu, K. Liu, and T. Clancy, “An anti-jamming stochastic game for cognitive radio networks,” IEEE Journal on Selected Areas in Communications, vol. 29, no. 4, pp. 877–889, 2011.
[16] M. Wiering and H. van Hasselt, “The QV family compared to other reinforcement learning algorithms,” in IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), pp. 101–108, 2009.
[17] M. Wiering and H. van Hasselt, “Two novel on-policy reinforcement learning algorithms based on TD(λ)-methods,” in IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL), pp. 280–287, 2007.
[18] M. Tan, “Multi-agent reinforcement learning: Independent vs. cooperative agents,” in Proceedings of the Tenth International Conference on Machine Learning, pp. 330–337, 1993.
[19] M. Bowling and M. Veloso, “An analysis of stochastic game theory for multiagent reinforcement learning,” Carnegie Mellon University, Tech. Rep. CMU-CS-00-165, 2000.