Comparing BERT’s Performance Against CNN, RNN, and LSTM in Spam Detection

Authors

  • Sahar A. Hussein Altaee

Keywords:

Bidirectional Encoder Representations from Transformers (BERT), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Natural Language Processing (NLP), Recurrent Neural Network (RNN)

Abstract

With the growing popularity of online social networks such as Twitter, spammers exploit these platforms by sending unsolicited messages, making it easier for attackers to reach users and carry out malicious activities. Spam is a form of platform manipulation: activity intended to negatively impact users’ experience on Twitter, including unwanted or repetitive behavior, malicious automation, and fake accounts, which makes spam detection extremely important. Recent advances in Natural Language Processing (NLP) show that transfer learning with pre-trained models outperforms traditional deep learning models, yet most studies have focused on large datasets. This paper addresses a real-world scenario often encountered in academic and industrial settings: given a small dataset, a pre-trained model such as BERT can be leveraged to achieve better results than simpler models. Using a Twitter spam dataset from Kaggle, we found that BERT outperformed CNN, RNN, and LSTM models.
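
As a concrete illustration of the transfer-learning setup the abstract describes, the sketch below fine-tunes bert-base-uncased for binary spam classification with the Hugging Face transformers library. It is a minimal sketch under stated assumptions: the toy tweets, the label convention (1 = spam, 0 = ham), the maximum sequence length, and the learning rate are illustrative placeholders, not the paper's actual Kaggle data or hyperparameters.

```python
# Minimal sketch: fine-tuning pre-trained BERT for binary spam
# classification. Toy data and hyperparameters are assumptions,
# not the paper's configuration.
import torch
from torch.utils.data import DataLoader, Dataset
from transformers import BertTokenizerFast, BertForSequenceClassification

class TweetDataset(Dataset):
    """Wraps tokenized tweets and labels for the DataLoader."""
    def __init__(self, texts, labels, tokenizer, max_len=64):
        self.enc = tokenizer(texts, truncation=True, padding="max_length",
                             max_length=max_len, return_tensors="pt")
        self.labels = torch.tensor(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # binary head: spam vs. ham

# Hypothetical toy examples standing in for the Kaggle Twitter dataset.
texts = ["WIN a FREE prize now!!! click this link",
         "Meeting moved to 3pm, see you there"]
labels = [1, 0]  # 1 = spam, 0 = ham (assumed label convention)

loader = DataLoader(TweetDataset(texts, labels, tokenizer),
                    batch_size=2, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):  # a few epochs usually suffice on small corpora
    for batch in loader:
        optimizer.zero_grad()
        out = model(**batch)   # loss is cross-entropy when labels are given
        out.loss.backward()
        optimizer.step()
```

Because the encoder weights are already pre-trained on a large corpus, only the small classification head and modest fine-tuning updates must be learned from the limited labeled tweets, which is why this setup can outperform CNN, RNN, and LSTM models trained from scratch on the same small dataset.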

Published

2025-12-02

How to Cite

Sahar A. Hussein Altaee. (2025). Comparing BERT’s Performance Against CNN, RNN, and LSTM in Spam Detection. International Journal of Computer Science, Algorithms and Programming Languages, 1(2), 33–45. Retrieved from https://matjournals.net/engineering/index.php/IJCSAPL/article/view/2700

Section

Articles