Extraction of Scene Text Information from Video

pp. 15-26


Too Kipyego Boaz 1,*, Prabhakar C.J. 1

1. Department of Computer Science, Kuvempu University, Shivammoga, Karnataka, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijigsp.2016.01.02

Received: 14 Sep. 2015 / Revised: 15 Oct. 2015 / Accepted: 1 Dec. 2015 / Published: 8 Jan. 2016

Index Terms

Natural Scene, Text Information Extraction, Stereo Frames


In this paper, we present an approach for extracting scene text from natural scene video frames. We assume that text in a natural scene lies on a planar surface. Based on this assumption, we detect planar surfaces in the disparity map obtained from a pair of video frames using a stereo vision technique. The planar surface is then segmented from the other regions using a Markov Random Field (MRF) with the graph cuts algorithm. The text information is extracted from the reduced reference, i.e., the extracted planar surface, by filtering with the Fourier-Laplacian algorithm. Experiments are carried out on our own dataset, and the results indicate a marked improvement in areas with complex backgrounds, where conventional methods fail.
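
As a rough illustration of the plane-detection step, the sketch below fits a plane to a disparity map. It relies on the fact that, for rectified stereo, a planar surface in 3D produces disparities that vary linearly with pixel position, d = a*x + b*y + c. The abstract does not state how the planes are detected: the use of RANSAC, the function names, the tolerance, and the synthetic data are all assumptions made for this example, not the authors' method.

```python
import random


def plane_from_points(p1, p2, p3):
    """Solve d = a*x + b*y + c through three (x, y, d) samples.

    Returns None when the three pixels are collinear in the image.
    """
    (x1, y1, d1), (x2, y2, d2), (x3, y3, d3) = p1, p2, p3
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)
    if abs(det) < 1e-12:
        return None
    a = ((d2 - d1) * (y3 - y1) - (d3 - d1) * (y2 - y1)) / det
    b = ((x2 - x1) * (d3 - d1) - (x3 - x1) * (d2 - d1)) / det
    c = d1 - a * x1 - b * y1
    return a, b, c


def ransac_disparity_plane(disparity, iterations=200, tol=0.5, seed=0):
    """Return ((a, b, c), inlier_mask) for the dominant disparity plane.

    `disparity` is a list of rows; None marks pixels without a valid match.
    `tol` is the largest disparity error (in pixels) counted as an inlier.
    """
    rng = random.Random(seed)
    points = [(x, y, d)
              for y, row in enumerate(disparity)
              for x, d in enumerate(row) if d is not None]
    if len(points) < 3:
        raise ValueError("need at least three valid disparity values")

    best_model, best_count = None, -1
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue
        a, b, c = model
        count = sum(1 for x, y, d in points
                    if abs(a * x + b * y + c - d) <= tol)
        if count > best_count:
            best_model, best_count = model, count
    if best_model is None:
        raise ValueError("no non-degenerate sample found")

    a, b, c = best_model
    mask = [[d is not None and abs(a * x + b * y + c - d) <= tol
             for x, d in enumerate(row)]
            for y, row in enumerate(disparity)]
    return best_model, mask


# Synthetic 20x20 disparity map (made-up data): a slanted plane on the left,
# constant-disparity background clutter on the right.
disparity = [[0.2 * x + 0.1 * y + 5.0 if x < 14 else 30.0 for x in range(20)]
             for y in range(20)]
model, mask = ransac_disparity_plane(disparity)
inliers = sum(cell for row in mask for cell in row)
print("plane (a, b, c):", tuple(round(v, 3) for v in model))
print("inlier pixels:", inliers)
```

The inlier mask marks the pixels that agree with the dominant plane. In a pipeline like the one described, such a mask could serve as the initial labelling that an MRF with graph cuts then refines before any text filtering is applied.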

Cite This Paper

Too Kipyego Boaz, Prabhakar C. J., "Extraction of Scene Text Information from Video", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 8, No. 1, pp. 15-26, 2016. DOI: 10.5815/ijigsp.2016.01.02

