The Method of Semantic Image Segmentation Using Neural Networks


Ihor Tereikovskyi (1,*), Denys Chernyshev (3), Liudmyla Tereikovska (3), Oleksandr Korystin (4), Oleh Tereikovskyi (5), Zhengbing Hu (2)

1. National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv, Ukraine

2. School of Computer Science, Hubei University of Technology, Wuhan, China

3. Kyiv National University of Construction and Architecture, Kyiv, Ukraine

4. Scientifically Research Institute of the Ministry of Internal Affairs, Kyiv, Ukraine

5. National Aviation University, Kyiv, Ukraine

* Corresponding author.


Received: 12 Aug. 2022 / Revised: 13 Sep. 2022 / Accepted: 15 Oct. 2022 / Published: 8 Dec. 2022

Index Terms

Semantic Segmentation Method, Convolutional Neural Network, Encoder, Decoder, Neural Network Model Efficiency, Segmentation Accuracy.


Abstract

Semantic image segmentation based on neural networks is increasingly used in computer systems for a variety of purposes. Despite significant progress in this field, one of the most important unsolved problems is adapting a neural network model to the conditions under which an object mask is selected in an image. Solving this task requires determining the type and parameters of the convolutional neural networks (CNNs) that underlie the encoder and decoder. This research developed a method that adapts the neural network encoder and decoder to the following conditions of the segmentation problem: image size; number of color channels; acceptable minimum segmentation accuracy; acceptable maximum computational complexity of segmentation; the need to label segments; the need to select several segments; the need to select deformed, displaced, and rotated objects; allowable maximum computational complexity of training the neural network model; and allowable training time for the model. The main stages of the method are: determining the list of image parameters to be registered; forming the parameters of the training examples for the neural network model used for object selection; determining the type of CNN encoder and decoder that is most effective under the conditions of the given task; forming a representative training sample; substantiating the parameters used to assess the accuracy of selection; calculating the design parameters of the CNN of the specified type for the encoder and decoder; and assessing the accuracy of selection and, if necessary, refining the architecture of the neural network model. The method was verified experimentally on the semantic segmentation of images containing objects such as cars.
The experimental results show that the proposed method makes it possible to build, without complex long-term experiments, a neural network that achieves an image segmentation accuracy of about 0.8 after a sufficiently short training period, which corresponds to the best systems of similar purpose. Further research should focus on approaches to using, in the encoder and decoder, special modules such as ResNet and Inception and mechanisms of the partial convolution type, which modern deep neural networks employ to increase their computational efficiency.
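
The conditions of the segmentation problem listed in the abstract can be written down in a machine-readable form. The Python sketch below is illustrative only and is not taken from the paper: the class names, field names, units, and the checking function are hypothetical, and every numeric value other than the accuracy threshold of about 0.8 is a placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SegmentationTaskConditions:
    """Task conditions named in the abstract (field names and units are assumptions)."""
    image_height: int
    image_width: int
    color_channels: int
    min_accuracy: float          # acceptable minimum segmentation accuracy
    max_inference_cost: float    # acceptable maximum cost of segmentation (assumed unit: GFLOPs)
    max_training_cost: float     # allowable maximum cost of training (assumed unit: GFLOPs)
    max_training_hours: float    # allowable training time
    # Qualitative requirements, recorded for completeness; not used in the check below.
    label_segments: bool = False
    multiple_segments: bool = False
    deformed_displaced_rotated: bool = False


@dataclass(frozen=True)
class CandidateModel:
    """Measured properties of one encoder-decoder candidate (hypothetical structure)."""
    name: str
    accuracy: float
    inference_cost: float
    training_cost: float
    training_hours: float


def unmet_conditions(task: SegmentationTaskConditions, model: CandidateModel) -> List[str]:
    """Return the names of the quantitative conditions that the candidate violates."""
    failures = []
    if model.accuracy < task.min_accuracy:
        failures.append("min_accuracy")
    if model.inference_cost > task.max_inference_cost:
        failures.append("max_inference_cost")
    if model.training_cost > task.max_training_cost:
        failures.append("max_training_cost")
    if model.training_hours > task.max_training_hours:
        failures.append("max_training_hours")
    return failures


# Placeholder task: only the 0.8 accuracy threshold echoes the abstract.
task = SegmentationTaskConditions(
    image_height=256, image_width=256, color_channels=3,
    min_accuracy=0.8, max_inference_cost=10.0,
    max_training_cost=1000.0, max_training_hours=4.0,
)
candidate = CandidateModel("candidate-a", accuracy=0.81, inference_cost=6.5,
                           training_cost=800.0, training_hours=5.0)
print(unmet_conditions(task, candidate))
```

An empty list from `unmet_conditions` would mean the candidate satisfies every quantitative condition. A non-empty list corresponds to the last stage of the method, where the architecture is refined if necessary.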

Cite This Paper

Ihor Tereikovskyi, Zhengbing Hu, Denys Chernyshev, Liudmyla Tereikovska, Oleksandr Korystin, Oleh Tereikovskyi, "The Method of Semantic Image Segmentation Using Neural Networks", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol. 14, No. 6, pp. 1-14, 2022. DOI: 10.5815/ijigsp.2022.06.01

