Phishing Attack Mitigation Through AI: A Review of Feature Engineering and Classification Techniques with Model Interpretability

Authors

  • Nayan Pundlik, Postgraduate Student, Department of Computer Science and Engineering, LNCT College, Bhopal, Madhya Pradesh, India
  • Tripti Saxena, Professor, Department of Computer Science and Engineering, LNCT College, Bhopal, Madhya Pradesh, India

Keywords:

Artificial intelligence, Cybersecurity, Dimensionality reduction, Ensemble method, Explainable AI (XAI), Feature engineering, LIME, Model interpretability, Phishing detection, SHAP, URL-based features

Abstract

Phishing, which uses fake emails, URLs, and websites to trick unwary users into revealing sensitive data, remains one of the most sophisticated and persistent cybersecurity threats. As phishing attacks grow more complex and frequent, traditional rule-based protection strategies are becoming inadequate. Artificial Intelligence (AI) has emerged as a powerful alternative, providing adaptive, scalable, and precise detection methods. This paper presents a comprehensive review of AI-driven phishing mitigation techniques, emphasizing three key areas: feature engineering, classification methods, and model interpretability. The study examines commonly used features, including URL-based, domain-based, content-based, and behavioral features, along with advanced feature selection and dimensionality reduction strategies such as PCA, LDA, and autoencoders. It discusses traditional machine learning classifiers (SVM, Random Forest, k-NN), deep learning architectures such as CNN, RNN, and LSTM, and ensemble techniques such as XGBoost and AdaBoost. Recognizing the complexity of AI models, it highlights interpretability techniques such as SHAP and LIME, as well as inherently transparent models, to improve trustworthiness and explainability in phishing detection.
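
As a rough illustration of what the URL-based features named in the abstract can look like in practice, the following minimal Python sketch derives a few lexical features from a URL string. The function name `url_features` and the specific features chosen (length, dot and hyphen counts, presence of `@`, an IPv4 host, HTTPS scheme) are assumptions made for illustration; the abstract does not specify which features the review covers.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few illustrative lexical features from a URL.

    The feature set is a hypothetical example, not the one used in the review.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Overall length of the URL string
        "url_length": len(url),
        # Number of dots in the host (a proxy for subdomain depth)
        "num_dots": host.count("."),
        # Hyphens in the host name
        "num_hyphens": host.count("-"),
        # '@' anywhere in the URL
        "has_at_symbol": "@" in url,
        # Host written as four dot-separated digit groups (IPv4-like)
        "host_is_ipv4": bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host)),
        # Whether the scheme is HTTPS
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    print(url_features("http://192.168.0.1/login@secure"))
    print(url_features("https://www.example.com/"))
```

A feature dictionary of this kind could then be turned into a numeric vector and passed to any of the classifiers mentioned above; the choice of classifier is independent of this sketch.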

Published

2025-09-11
