Image Semantic Segmentation Using Deep Learning

Full Text (PDF, 1151KB), PP.1-10



Vihar Kurama 1,*, Samhita Alla 1, Rohith Vishnu K 1

1. Chaitanya Bharathi Institute of Technology, Hyderabad - 500075, India.

* Corresponding author.


Received: 8 Jul. 2018 / Revised: 5 Sep. 2018 / Accepted: 16 Oct. 2018 / Published: 8 Dec. 2018

Index Terms

Artificial Neural Networks, Image Segmentation, Computer Vision, Artificial Intelligence, Convolutional Neural Networks


In the field of computer vision, image semantic segmentation is one of the most actively researched areas. It is widely used in real-time problems to identify the foreground or background of a given image or video. Segmentation was initially achieved with classical computer vision techniques; with the rise of deep learning, neural approaches came to dominate both image classification and segmentation. These methods have been widely surveyed and reviewed because they are used in image processing, feature detection, and medical applications. Most image segmentation models are built on a specific neural network architecture, the convolutional neural network. In this work, we first study the implementation of image segmentation models and their advantages and disadvantages relative to one another, including their development trends. We then discuss the models and their applications in relation to other commonly used methods, covering the hyperparameters involved and a transitive comparison between them.

Cite This Paper

Vihar Kurama, Samhita Alla, Rohith Vishnu K, "Image Semantic Segmentation Using Deep Learning", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol.10, No.12, pp. 1-10, 2018. DOI: 10.5815/ijigsp.2018.12.01

