Malware Detection and Classification using Shapley Additive Explanations Values in Machine Learning

pp. 18-32

Author(s)

Balachandra Chikkoppa 1,*, Hanumanthappa J. 2, Wai Yie Leong 3

1. Department of Computer Science, KLE Institute of Technology, Gokul, Hubballi, 580028, Karnataka, India

2. DOS in CS, University of Mysore, Manasgongothri, Mysuru, 10587, Karnataka, India

3. INTI International University and Colleges, Nilai, Negeri Sembilan, Malaysia

* Corresponding author.

DOI: https://doi.org/10.5815/ijcnis.2026.01.02

Received: 16 Jul. 2025 / Revised: 18 Sep. 2025 / Accepted: 16 Nov. 2025 / Published: 8 Feb. 2026

Index Terms

Machine Learning, Malware, Sandbox, PE File, SHAP

Abstract

Detecting malware on the Internet is essential given the wide range of online IT services now available. Portable Executable (PE) files are the file format most frequently targeted by malware. A deployable learning system is needed so that malware can be promptly identified and flagged in a real-world environment. Researchers have applied machine learning to malware datasets and reported model performance metrics obtained at high computational cost, but have been unable to deploy the models in a real-world environment. The proposed work achieves a deployable machine learning model using Random Forest (RF) that attains an accuracy of 97.16%, a precision of 95.21%, and an F1 score of 95.24%, and is particularly adept at identifying malware. We have developed a novel classification model that employs a Support Vector Machine (SVM) to classify preprocessed data into malware and normal instances. Furthermore, the Shapley Additive Explanations (SHAP) technique identifies significant features, including SizeOfStackReserve, DllCharacteristics, and MajorImageVersion. SHAP values make it easier to understand how each feature contributes to the model's prediction. Applying the SHAP algorithm to the trained SVM model to reduce the feature set attained an accuracy of 97.16%.
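To illustrate how Shapley values attribute a single prediction to individual features, the following is a minimal sketch in plain Python. It is not the authors' pipeline and does not use the SHAP library: the scoring function `toy_score`, its coefficients, and the normalized feature values are hypothetical, chosen only so that the three PE header features named in the abstract can be ranked by exact Shapley value. Features left out of a subset are filled in from an assumed all-zero baseline sample.

```python
from itertools import combinations
from math import factorial

# Feature names are taken from the abstract; everything else is illustrative.
FEATURES = ["SizeOfStackReserve", "DllCharacteristics", "MajorImageVersion"]


def toy_score(x):
    """Hypothetical malware score (NOT the paper's RF or SVM model)."""
    score = 0.1
    score += 0.2 * x["SizeOfStackReserve"]
    score += 0.4 * x["DllCharacteristics"]
    # Interaction term, so the attribution is not simply the coefficients.
    score += 0.2 * x["DllCharacteristics"] * x["MajorImageVersion"]
    return score


def value(subset, sample, baseline):
    """Score when only the features in `subset` take the sample's values."""
    mixed = {f: (sample[f] if f in subset else baseline[f]) for f in FEATURES}
    return toy_score(mixed)


def shapley_values(sample, baseline):
    """Exact Shapley values by enumerating every subset of the other features."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(n):  # subset sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = value(set(subset) | {f}, sample, baseline)
                without_f = value(set(subset), sample, baseline)
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi


# Assumed normalized feature values for one file and an all-zero baseline.
sample = {f: 1.0 for f in FEATURES}
baseline = {f: 0.0 for f in FEATURES}

phi = shapley_values(sample, baseline)
for name, contribution in sorted(phi.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the score of the sample and the score of the baseline, which is the property that makes a ranking by Shapley value usable for feature reduction. Exact enumeration grows exponentially with the number of features, so it is practical only for small feature sets like this one.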

Cite This Paper

Balachandra Chikkoppa, Hanumanthappa J., Wai Yie Leong, "Malware Detection and Classification using Shapley Additive Explanations Values in Machine Learning", International Journal of Computer Network and Information Security (IJCNIS), Vol.18, No.1, pp.18-32, 2026. DOI: 10.5815/ijcnis.2026.01.02
