Explainable AI for Predicting Malware Threat Levels: A SHAP-Enhanced Random Forest Approach

P. Devi Sravanthi; Manas Kumar Yogi

Authors

P. Devi Sravanthi, Post Graduate Student, Department of Computer Science and Engineering, Pragati Engineering College (A), Surampalem, Andhra Pradesh, India
Manas Kumar Yogi, Assistant Professor, Department of Computer Science and Engineering, Pragati Engineering College (A), Surampalem, Andhra Pradesh, India

Keywords:

AI transparency, Cyber threat, Explainable AI (XAI), Machine learning, Random Forest, SHAP (SHapley Additive exPlanations)

Abstract

This project presents an explainable AI (XAI) approach for predicting malware threat levels using a SHAP-enhanced Random Forest model. As malware detection becomes increasingly vital in cybersecurity, the need for transparent and interpretable models grows, especially in high-stakes environments. We address this need by integrating SHAP (SHapley Additive exPlanations), a widely recognized explainability technique, with Random Forest, a robust ensemble learning method known for strong predictive performance in classification tasks. The Random Forest model is trained on a comprehensive dataset of malware features, and SHAP values are then computed to provide detailed, human-understandable explanations of its predictions. These explanations identify the features that most influence each threat level assessment and show how different malware characteristics affect the model's decisions. Evaluated on a benchmark malware dataset, the model achieves high prediction accuracy while remaining interpretable. A comparative analysis against traditional black-box models, such as deep learning-based approaches, shows that the method balances predictive performance and explainability. This transparency improves trust in AI-driven security systems and enables security analysts to better interpret and act on predictions. The results indicate potential for practical deployment in real-world cybersecurity applications, where both accuracy and explainability are crucial for timely and informed decision-making in threat management.
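
The idea of attributing a forest's threat score to individual malware features can be illustrated with a minimal Python sketch. It is not the authors' implementation: the feature names, thresholds, and the four-stump "forest" are hypothetical, and the Shapley values are computed by brute-force enumeration of feature coalitions against a single baseline sample rather than with the tree-specific algorithm that SHAP libraries use in practice.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, chosen only for illustration.
FEATURES = ["entropy", "num_imports", "writes_registry"]


def stump(feature, threshold, low, high):
    """Return a one-split tree: `high` if x[feature] > threshold, else `low`."""
    def predict(x):
        return high if x[feature] > threshold else low
    return predict


# A toy "forest" (assumed values): the threat score is the mean stump output.
FOREST = [
    stump("entropy", 7.0, 0.1, 0.9),
    stump("num_imports", 50, 0.2, 0.6),
    stump("writes_registry", 0.5, 0.0, 1.0),
    stump("entropy", 6.0, 0.2, 0.8),
]


def forest_predict(x):
    return sum(tree(x) for tree in FOREST) / len(FOREST)


def shapley_values(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(FEATURES)
    values = {}
    for i in FEATURES:
        others = [f for f in FEATURES if f != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without = {f: (x[f] if f in coalition else baseline[f])
                           for f in FEATURES}
                with_i = dict(without)
                with_i[i] = x[i]
                total += weight * (predict(with_i) - predict(without))
        values[i] = total
    return values


# Hypothetical sample under analysis and a benign-looking reference sample.
sample = {"entropy": 7.4, "num_imports": 12, "writes_registry": 1}
baseline = {"entropy": 5.0, "num_imports": 30, "writes_registry": 0}

phi = shapley_values(forest_predict, sample, baseline)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the score for the sample and the score for the baseline, which is the property that lets an analyst read each value as that feature's share of the predicted threat level. Brute-force enumeration grows exponentially with the number of features, so a realistic feature set would call for a tree-specific explainer instead.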

