Anti-phishing Frameworks for Safer Digital Ecosystems: A Systematic Review
Abstract
The digital ecosystem faces an unprecedented wave of phishing attacks, with over 1.9 million detected incidents in 2024 alone. This systematic review analyzes 30 research papers on phishing detection techniques, ranging from classical machine learning to emerging LLM-based systems. It presents a comprehensive taxonomy of detection methods and identifies Random Forest and ensemble methods as the current accuracy leaders (up to 99.96%). Beyond the literature review, it provides empirical validation by implementing two top-performing models, Random Forest (Paper 1) and XGBoost (Paper 15), on the largest publicly available dataset (235,795 URLs).
Key results show that Random Forest achieves 99.9215% accuracy with an adversarial robustness score of 77.07/100, while XGBoost achieves 99.9555% accuracy with a throughput of over 1.5 million samples/sec. Both models surpass or nearly match the baselines reported in their original papers on a dataset roughly 20× larger, even after the removal of 9 features suspected of causing data leakage.
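The abstract does not state how the 9 suspicious features were identified. As an illustration only, one common screening heuristic is to flag any feature that, on its own, separates the classes almost perfectly with a single threshold. The sketch below assumes binary 0/1 labels and numeric features; the function names, the toy columns, and the 0.99 cutoff are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Sequence


def best_single_threshold_accuracy(values: Sequence[float],
                                   labels: Sequence[int]) -> float:
    """Best accuracy of any one-feature rule `value <= t` (either polarity)."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_pos = sum(label for _, label in pairs)
    # Baseline: always predict the majority class.
    best = max(total_pos, n - total_pos) / n
    pos_left = 0
    for i, (value, label) in enumerate(pairs):
        pos_left += label
        # Only place a cut between two distinct feature values.
        if i + 1 < n and pairs[i + 1][0] == value:
            continue
        left = i + 1
        neg_left = left - pos_left
        pos_right = total_pos - pos_left
        neg_right = (n - left) - pos_right
        # Rule A: left side positive, right side negative.
        # Rule B: left side negative, right side positive.
        best = max(best,
                   (pos_left + neg_right) / n,
                   (neg_left + pos_right) / n)
    return best


def flag_suspicious_features(columns: Dict[str, Sequence[float]],
                             labels: Sequence[int],
                             cutoff: float = 0.99) -> List[str]:
    """Names of features whose single-threshold accuracy reaches `cutoff`."""
    return sorted(
        name for name, values in columns.items()
        if best_single_threshold_accuracy(values, labels) >= cutoff
    )


# Toy data (hypothetical): one feature that mirrors the label, one that does not.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_columns = {
    "leaky_score": [0.10, 0.20, 0.15, 0.90, 0.95, 0.80],
    "url_length": [10, 50, 30, 40, 20, 60],
}
print(flag_suspicious_features(toy_columns, toy_labels))
```

A feature flagged this way is only a candidate for removal: it still has to be checked for whether it is derived from the label or would be unavailable at prediction time, since a legitimately strong predictor can also score highly.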
References
A. A. Albishri and M. M. Dessouky, “A comparative analysis of machine learning techniques for URL phishing detection,” Engineering, Technology & Applied Science Research, vol. 14, no. 6, pp. 18495–18501, Dec. 2024.
H. Kim, J. H. Kim, J. Son, J. Song, and E. Lee, “A framework for mining collectively-behaving bots in MMORPGs,” in Lecture Notes in Computer Science, Dec. 2024, pp. 400–419.
F. T. Johora, M. S. I. Khan, E. Kanon, M. A. T. Rony, M. Zubair, and I. H. Sarker, “A data-driven predictive analysis on cyber security threats with key risk factors,” arXiv, Mar. 2024.
S. N. Min, N. Fazlida, and M. Sani, “Message conversation based social engineering attack detection using machine learning,” Journal of Theoretical and Applied Information Technology, vol. 103, no. 14, 2025.
A. Alghuried et al., “Comprehensive evaluation of adversarial perturbations against ML-based Ethereum phishing detection systems,” Distributed Ledger Technologies: Research and Practice, Oct. 2025.
M. A. Taha, H. D. A. Jabar, and W. K. Mohammed, “A machine learning algorithm for detecting phishing websites: A comparative study,” Iraqi Journal for Computer Science and Mathematics, vol. 5, no. 3, pp. 275–286, Jul. 2024.
R. Alzubi, T. Bishtawi, and H. Kassem, “Improving web security through machine learning: A feature-based methodology for detecting phishing URLs,” Engineering, Technology & Applied Science Research, vol. 15, no. 5, pp. 26845–26851, Oct. 2025.
S. K. Ahmad, B. A. Dapshima, and Y. C. Essa, “Detection of phishing attacks using machine learning techniques,” International Research Journal of Modernization in Engineering Technology and Science, vol. 6, no. 7, Jul. 2024.
C. Lee, “Enhancing phishing email identification with large language models,” arXiv, 2025.
Y. Xue, E. Spero, Y. S. Koh, and G. Russello, “MultiPhishGuard: An LLM-based multi-agent system for phishing email detection,” arXiv, 2025.
W. Guo, Q. Wang, H. Yue, H. Sun, and R. Q. Hu, “Efficient phishing URL detection using graph-based machine learning and loopy belief propagation,” arXiv, 2025.
A. Newaz, F. S. Haq, and N. Ahmed, “A sophisticated framework for the accurate detection of phishing websites,” arXiv, Mar. 2024.
R. Jayaprakash et al., “Heuristic machine learning approaches for identifying phishing threats across web and email platforms,” Frontiers in Artificial Intelligence, vol. 7, Oct. 2024.
M. K. H. Chy, “Securing the web: Machine learning’s role in predicting and preventing phishing attacks,” International Journal of Science and Research Archive, vol. 13, no. 1, pp. 1004–1011, Sep. 2024.
S. Kristiansen and A. Jensen, “Victimization in online gaming-related trade scams: A study among young Danes,” Nordic Journal of Criminology, vol. 24, no. 2, pp. 1–17, Aug. 2023.
A. Stoica, “Social engineering as the new deception game,” Revista Română de Informatică și Automatică, vol. 31, no. 3, pp. 57–68, Oct. 2021.
E. A. Aldakheel, M. Zakariah, G. A. Gashgari, F. A. Almarshad, and A. I. A. Alzahrani, “A deep learning-based innovative technique for phishing detection in modern security with uniform resource locators,” Sensors, vol. 23, no. 9, Art. no. 4403, Apr. 2023.
S. Aslam, H. Aslam, A. Manzoor, H. Chen, and A. Rasool, “AntiPhishStack: LSTM-based stacked generalization model for optimized phishing URL detection,” Symmetry, vol. 16, no. 2, Art. no. 248, Feb. 2024.
N. Altwaijry, I. Al-Turaiki, R. Alotaibi, and F. Alakeel, “Advancing phishing email detection: A comparative study of deep learning models,” Sensors, vol. 24, no. 7, Art. no. 2077, Mar. 2024.
K. H. Park, E. Lee, and H. Kim, “Cashflow tracing: Detecting online game bots leveraging financial analysis with recurrent neural networks,” in Proc. ACM SIGCHI Annual Symposium on Computer-Human Interaction in Play, Nov. 2022.
H. Jabbar and S. Al-Janabi, “AI-driven phishing detection: Enhancing cybersecurity with reinforcement learning,” Journal of Cybersecurity and Privacy, vol. 5, no. 2, Art. no. 26, May 2025.
M. A. Uddin and I. H. Sarker, “An explainable transformer-based model for phishing email detection: A large language model approach,” arXiv, Feb. 2024.
W. Huang et al., “EvoMail: Self-evolving cognitive agents for adaptive spam and phishing email defense,” arXiv, 2025.
A. Alturki, N. Alshwihi, and A. Algarni, “Factors influencing players’ susceptibility to social engineering in social gaming networks,” IEEE Access, vol. 8, pp. 97383–97391, 2020.
R. Mahajan and I. Siddavatam, “Phishing website detection using machine learning algorithms,” International Journal of Computer Applications, vol. 181, no. 23, pp. 45–47, Oct. 2018.
F. Ji et al., “Evaluating the effectiveness and robustness of visual similarity-based phishing detection models,” arXiv, 2024.
M. S. I. Ovi et al., “PhishGuard: A multi-layered ensemble model for optimal phishing website detection,” in Proc. 6th Int. Conf. Sustainable Technologies for Industry 5.0 (STI), 2024.
D. H. Kulal, C. P. Arannonu, A. Anwar, N. Rastogi, and Q. Niyaz, “Robust ML-based detection of conventional, LLM-generated, and adversarial phishing emails using advanced text preprocessing,” arXiv, 2025.
F. Song, Y. Lei, S. Chen, L. Fan, and Y. Liu, “Advanced evasion attacks and mitigations on practical ML-based phishing website classifiers,” International Journal of Intelligent Systems, vol. 36, no. 9, pp. 5210–5240, Jun. 2021.
H. Nakano, T. Koide, and D. Chiba, “PhishParrot: LLM-driven adaptive crawling to unveil cloaked phishing sites,” in Proc. IEEE Global Communications Conf. (GLOBECOM), Aug. 2025.