Enhancing Email Filtering Systems: Mitigating Adversarial Attacks with Advanced Defense Mechanisms

Authors

  • Shyam Kumar Kukunuri, Undergraduate Student, Department of Information Technology, Mahatma Gandhi Institute of Technology (MGIT), Hyderabad, Telangana, India
  • Anirudh Bagavatula, Undergraduate Student, Department of Information Technology, Mahatma Gandhi Institute of Technology (MGIT), Hyderabad, Telangana, India
  • Prem Kumar Chithaluru, Assistant Professor, Department of Information Technology, Mahatma Gandhi Institute of Technology (MGIT), Hyderabad, Telangana, India

Keywords

Adversarial attacks, Email classification, Ensemble learning, Machine learning, Robustness, Spam detection

Abstract

Email filtering systems that rely on Machine Learning (ML) and Natural Language Processing (NLP) are increasingly vulnerable to adversarial attacks, in which malicious actors subtly alter email content to bypass detection. These attacks exploit weaknesses in current models, and conventional filters often fail to detect the resulting adversarial examples, which lets phishing and spam messages through and poses significant security risks. This paper proposes a robust email classification system that accurately distinguishes spam from legitimate (ham) emails, even under adversarial conditions. The system evaluates multiple traditional machine learning algorithms, including Random Forest, Support Vector Machine, Logistic Regression, and Naive Bayes, and compares their performance before and after adversarial attacks. Six types of text-based adversarial attacks are applied: character swap, homoglyph replacement, synonym substitution, word-level noise, backtranslation, and paraphrasing. Evaluation on the Enron email dataset shows that adversarial training considerably improves classification robustness. The proposed ensemble method mitigates the performance loss commonly seen under adversarial conditions while maintaining high accuracy.
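
To make two of the listed perturbations concrete, the following is a minimal illustrative sketch of a character swap and a homoglyph replacement. It is not the authors' implementation: the function names, the `HOMOGLYPHS` table, the `rate` parameter, and the sample sentence are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import random

# Hypothetical table mapping Latin letters to visually similar Cyrillic letters.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic a
    "c": "\u0441",  # Cyrillic es
    "e": "\u0435",  # Cyrillic ie
    "o": "\u043e",  # Cyrillic o
    "p": "\u0440",  # Cyrillic er
}


def char_swap(text, rng):
    """Swap two adjacent inner characters of one randomly chosen word.

    Words shorter than four characters are left alone. The swap can be a
    no-op when the two chosen characters are identical. Whitespace is
    normalized to single spaces because the text is split and rejoined.
    """
    words = text.split()
    candidates = [i for i, w in enumerate(words) if len(w) >= 4]
    if not candidates:
        return text
    i = rng.choice(candidates)
    w = words[i]
    j = rng.randrange(1, len(w) - 2)  # keep first and last characters fixed
    words[i] = w[:j] + w[j + 1] + w[j] + w[j + 2:]
    return " ".join(words)


def homoglyph_replace(text, rate, rng):
    """Replace each mappable character with its look-alike with probability `rate`."""
    return "".join(
        HOMOGLYPHS[ch] if ch in HOMOGLYPHS and rng.random() < rate else ch
        for ch in text
    )


rng = random.Random(0)  # fixed seed so the perturbations are reproducible
original = "claim your free prize now"
swapped = char_swap(original, rng)
disguised = homoglyph_replace(original, 1.0, rng)
print(swapped)
print(disguised)
```

Both perturbations leave the message readable to a person while changing the character sequence that a tokenizer or bag-of-words feature extractor sees, which is why training on such perturbed copies is a plausible way to harden a classifier.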

Published

2025-07-24