IJCNIS Vol. 18, No. 2, 8 Apr. 2026
Keywords: Cloud-native Architecture, Artificial Intelligence, Infrastructure Optimization, Anomaly Detection, Kubernetes, Machine Learning, AIOps
The article describes a model of cloud-native AI pipelines designed for continuous optimization of computing infrastructure and real-time anomaly detection. The model combines observability, machine learning (ML), and auto-scaling driven by load forecasting. The methodology uses LSTM models, autoencoders, and convolutional neural networks (CNNs) integrated into a Kubernetes environment with support for Prometheus, Kafka, and Grafana. Load changes are simulated, and the system's response to critical events is evaluated. The results demonstrate a significant improvement in anomaly detection accuracy (up to 93%) and in resource efficiency (up to 26% cost reduction compared with traditional approaches). The proposed model can be used in AIOps systems that require a high level of automation and reliability.
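The forecast-driven scaling and anomaly-flagging loop summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: a moving-average forecast stands in for the LSTM forecaster, a residual threshold stands in for the autoencoder and CNN detectors, and all function names, the per-pod capacity, and the threshold values are hypothetical. In a real deployment the history would presumably come from Prometheus metrics and the replica count would be applied through a Kubernetes autoscaler.

```python
import math
from statistics import mean, stdev


def forecast_next(history, window=3):
    """Moving-average forecast of the next load value.

    Assumption: a simple stand-in for the LSTM forecaster named in the abstract.
    """
    return mean(history[-window:])


def desired_replicas(forecast_load, load_per_pod, min_replicas=1, max_replicas=10):
    """Replica count needed to serve the forecast load, clamped to bounds.

    load_per_pod is a hypothetical per-pod capacity in the same units as the load.
    """
    needed = math.ceil(forecast_load / load_per_pod)
    return max(min_replicas, min(max_replicas, needed))


def is_anomaly(history, observed, window=3, threshold=3.0):
    """Flag an observation whose forecast residual is unusually large.

    The residual is compared with `threshold` times the standard deviation
    of past one-step forecast residuals over the same history.
    """
    residuals = [
        history[i] - mean(history[i - window:i])
        for i in range(window, len(history))
    ]
    if len(residuals) < 2:
        return False  # not enough data to estimate the spread
    sigma = stdev(residuals)
    error = abs(observed - forecast_next(history, window))
    if sigma == 0:
        return error > 0
    return error > threshold * sigma


# Hypothetical requests-per-second samples, for illustration only.
load_history = [100, 110, 105, 108, 112, 107, 111]
next_load = forecast_next(load_history)
replicas = desired_replicas(next_load, load_per_pod=50)
spike_flagged = is_anomaly(load_history, observed=400)
normal_flagged = is_anomaly(load_history, observed=112)
```

Scaling on the forecast rather than on the current measurement is what makes the approach proactive: capacity is requested before the load arrives, while the same forecast residual doubles as a cheap anomaly signal.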
Viktor Vyshnivskyi, Vadym Mukhin, Olha Zinchenko, Vitalii Kotelianets, Oleksandr Zvenihorodskyi, Pavlo Kudrynskyi, Oleksandr Vyshnivskyi, "Cloud-native AI Pipelines for Continuous Infrastructure Optimization and Anomaly Detection", International Journal of Computer Network and Information Security (IJCNIS), Vol. 18, No. 2, pp. 1-18, 2026. DOI: 10.5815/ijcnis.2026.02.01