Filename Spoofing Detection and Prevention: A Study
Keywords:
Behavioral analysis, File upload security, Filename spoofing, Machine learning, MIME type verification, Unicode attacks
Abstract
Filename spoofing is an evasion technique in file-based malware attacks: the attacker manipulates a file's name, extension, or character encoding so that a malicious executable appears to be a legitimate file type, such as a PDF, an image, or a text file. The technique exploits limitations in how operating systems and user interfaces display and interpret filenames. Common variants include double extensions, hidden file extensions, Unicode homograph attacks, bidirectional (BiDi) control character exploitation, and MIME type mismatches. Microsoft Windows and Apple macOS are particularly exposed because they rely on filename extensions to identify file types. Over the last ten years, both the attacks and the defenses against them have developed considerably, with detection moving from rule-based systems to machine learning and behavioral analysis models. Results from security analysis services such as VirusTotal indicate that conventional signature-based antivirus systems are ineffective against filename spoofing, because manipulating a filename does not change the file's binary signature. This has motivated the development of MIME header verification, entropy-based anomaly detection, Unicode normalization, and machine learning classifiers for detecting filename spoofing. This study provides a comprehensive overview of filename spoofing attacks and the techniques used to detect and prevent them.
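As an illustration of three of the checks named in the abstract (double extensions, BiDi control characters, and a mismatch between the claimed extension and the file content), the following is a minimal rule-based sketch in Python. It is not the method of this study. The function names `check_filename` and `content_matches_extension`, the extension lists, and the small table of file signatures are all assumptions chosen for illustration and are far from complete.

```python
from pathlib import PurePath
from typing import List, Optional

# Unicode bidirectional control characters that can visually reorder a name.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
    "\u200e", "\u200f",                                # LRM, RLM
}

# Illustrative, incomplete lists (assumptions, not taken from the study).
EXECUTABLE_EXTS = {".exe", ".scr", ".bat", ".cmd", ".com", ".js", ".vbs", ".ps1"}
DECOY_EXTS = {".pdf", ".jpg", ".jpeg", ".png", ".txt", ".doc", ".docx"}

# A few well-known leading byte sequences ("magic numbers").
MAGIC = {
    ".pdf": b"%PDF-",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
}


def check_filename(name: str) -> List[str]:
    """Return the names of the rule-based checks that the filename trips."""
    findings = []
    # BiDi controls such as U+202E can make "…fdp.exe" render as "…exe.pdf".
    if any(ch in BIDI_CONTROLS for ch in name):
        findings.append("bidi-control")
    # Double extension: a decoy document extension followed by an executable one.
    suffixes = [s.lower() for s in PurePath(name).suffixes]
    if (len(suffixes) >= 2
            and suffixes[-1] in EXECUTABLE_EXTS
            and suffixes[-2] in DECOY_EXTS):
        findings.append("double-extension")
    return findings


def content_matches_extension(name: str, head: bytes) -> Optional[bool]:
    """Compare the first bytes of a file with the signature its extension implies.

    Returns None when the extension has no entry in the MAGIC table.
    """
    magic = MAGIC.get(PurePath(name).suffix.lower())
    if magic is None:
        return None
    return head.startswith(magic)
```

In use, `check_filename` would be applied to the name as received (for example, from an upload form), and `content_matches_extension` to the first few bytes of the stored file. A production system would likely combine such rules with the statistical and machine learning methods the abstract mentions, since fixed lists like these are easy to evade.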