Temporal Difference based Tuning of Fuzzy Logic Controller through Reinforcement Learning to Control an Inverted Pendulum




Raj Kumar 1,*, M. J. Nigam 1, Sudeep Sharma 1, Punitkumar Bhavsar 1

1. System Modeling and Control Group, Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijisa.2012.09.02

Received: 10 Oct. 2011 / Revised: 3 Feb. 2012 / Accepted: 11 Apr. 2012 / Published: 8 Aug. 2012

Index Terms

Reinforcement Learning, Q-learning, Inverted Pendulum, Fuzzy logic control, Temporal Difference


Abstract

This paper presents a self-tuning method for fuzzy logic controllers. The consequent part of the fuzzy logic controller is self-tuned through the Q-learning algorithm of reinforcement learning. An off-policy temporal difference algorithm is used for tuning; it directly approximates the action-value function that yields the maximum reward. In this way, the Q-learning algorithm is applied in a continuous-time environment. The approach retains the advantages of a fuzzy logic controller: it is robust under environmental uncertainties, and no expert knowledge is required to design the rule base.
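The core mechanism described above, an off-policy temporal difference (Q-learning) update applied to candidate consequent values of the fired fuzzy rules, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the table sizes, the epsilon-greedy behaviour policy, and all names are assumptions for the sketch.

```python
import random

# Hypothetical sketch: Q-learning (off-policy TD) tuning of a fuzzy
# controller's consequents. Dimensions and hyperparameters are illustrative.
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1  # learning rate, discount, exploration

# q[rule][action]: action-value table. Each "action" indexes a candidate
# consequent value for the fired fuzzy rule (treated here as the state).
n_rules, n_consequents = 5, 3
q = [[0.0] * n_consequents for _ in range(n_rules)]

def choose_action(rule):
    """Epsilon-greedy behaviour policy over the candidate consequents."""
    if random.random() < EPSILON:
        return random.randrange(n_consequents)
    row = q[rule]
    return row.index(max(row))

def q_update(rule, action, reward, next_rule):
    """Off-policy TD update: bootstrap on the greedy (max) next value,
    regardless of which action the behaviour policy actually takes."""
    td_target = reward + GAMMA * max(q[next_rule])
    q[rule][action] += ALPHA * (td_target - q[rule][action])
```

The off-policy character lies in the `max` over the next state's values: the update tracks the greedy target policy even while an exploratory epsilon-greedy policy selects consequents during learning.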

Cite This Paper

Raj Kumar, M. J. Nigam, Sudeep Sharma, Punitkumar Bhavsar, "Temporal Difference based Tuning of Fuzzy Logic Controller through Reinforcement Learning to Control an Inverted Pendulum", International Journal of Intelligent Systems and Applications (IJISA), vol. 4, no. 9, pp. 15-21, 2012. DOI: 10.5815/ijisa.2012.09.02

