Enhancing Intrusion Detection for Minority Attack Classes: A SHAP-Based Feature Selection Approach with Deep Neural Networks

Pages: 50-62

Author(s)

Anagha A. S. 1,*, Ciza Thomas 2, Sreelatha G. 3

1. College of Engineering Trivandrum, affiliated to APJ Abdul Kalam Technological University, Trivandrum, 695016, India

2. Digital University Kerala, Trivandrum, 695317, Kerala, India

3. College of Engineering Trivandrum, Trivandrum, 695016, India

* Corresponding author.

DOI: https://doi.org/10.5815/ijwmt.2026.01.04

Received: 7 May 2025 / Revised: 20 Jul. 2025 / Accepted: 14 Sep. 2025 / Published: 8 Feb. 2026

Index Terms

Intrusion Detection Systems, Random Forest, Deep Neural Network, SHAP, User-to-Root attack, Remote-to-Local attack

Abstract

In the dynamic landscape of cybersecurity, safeguarding computer networks against persistent malicious threats is paramount. Intrusion Detection Systems (IDS) play a crucial role in this context by monitoring network traffic for unauthorized access. While the integration of Machine Learning and Deep Learning has significantly advanced intrusion detection, effectively detecting minority attack classes remains a persistent challenge. This study introduces an approach that combines SHapley Additive exPlanations (SHAP) for feature selection with Deep Neural Networks (DNN) to enhance the performance of intrusion detection systems, focusing on minority attack classes in the NSL-KDD dataset. Applied to a Random Forest classifier trained on a balanced dataset, SHAP provides insight into feature importance, and the resulting refined feature set is fed into a DNN architecture. The research concentrates on improving detection accuracy for User-to-Root (U2R) and Remote-to-Local (R2L) attacks. The results show a notable improvement in performance along with a reduction in computational time compared to using all the available features. A key emphasis of the study is on detecting all attack types without compromising the F1-score. An in-depth analysis of the initial set of 41 features identifies 30 as crucial for effective intrusion detection. On the imbalanced dataset, SHAP-based feature reduction improved the overall F1-score in multiclass classification from 86% to 91% while reducing training time by 8.86%, confirming that SHAP can lower complexity without sacrificing accuracy. However, several minority attacks remained undetected due to their extremely low representation. Additional experiments with oversampled data confirm that SHAP continues to provide efficiency gains while enabling robust detection of rare attack classes.
These findings demonstrate that SHAP-based feature selection improves efficiency in IDS and has strong potential for minority attack detection if data scarcity is addressed. This research not only contributes to the enhancement of IDS capabilities but also highlights the importance of meticulous feature selection in achieving comprehensive and efficient intrusion detection.
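The feature selection idea summarized in the abstract (score each feature by its Shapley contribution, then keep only the highest-ranked ones) can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no code, and the sketch below replaces the Random Forest and the SHAP tooling with a hypothetical four-feature scoring function (`toy_model`) and a brute-force enumeration of all feature subsets. The rows, the baseline, and the choice of keeping two features are likewise assumptions made for illustration. Enumeration is only feasible for a handful of features; with the 41 features of NSL-KDD it would be intractable, which is why tree-specific SHAP algorithms are used in practice.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance by enumerating feature subsets."""
    n = len(x)

    def value(subset):
        # Features in `subset` take the instance value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def rank_features(predict, rows, baseline):
    """Rank features by mean absolute Shapley value over all rows."""
    n = len(baseline)
    totals = [0.0] * n
    for x in rows:
        for i, p in enumerate(shapley_values(predict, x, baseline)):
            totals[i] += abs(p)
    mean_abs = [t / len(rows) for t in totals]
    order = sorted(range(n), key=lambda i: mean_abs[i], reverse=True)
    return order, mean_abs


def toy_model(z):
    # Hypothetical scorer (assumption): feature 0 dominates, feature 2 is
    # weaker, feature 3 acts only through an interaction, feature 1 is unused.
    return 3.0 * z[0] + 0.8 * z[2] + 1.0 * z[0] * z[3]


# Hypothetical data (assumption): four binary rows and their column means.
ROWS = [[1, 0, 1, 0], [0, 1, 1, 1], [1, 1, 0, 1], [0, 0, 0, 0]]
BASELINE = [0.5, 0.5, 0.5, 0.5]

order, mean_abs = rank_features(toy_model, ROWS, BASELINE)
selected = order[:2]  # keep the two highest-ranked features
print("ranking:", order)
print("selected:", selected)
```

In this sketch the unused feature receives an importance of exactly zero and drops to the bottom of the ranking, which mirrors how low-importance features would be pruned before training a downstream classifier on the reduced feature set.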

Cite This Paper

Anagha A. S., Ciza Thomas, Sreelatha G., "Enhancing Intrusion Detection for Minority Attack Classes: A SHAP-Based Feature Selection Approach with Deep Neural Networks", International Journal of Wireless and Microwave Technologies (IJWMT), Vol.16, No.1, pp. 50-62, 2026. DOI: 10.5815/ijwmt.2026.01.04
