A Review of Big Data Analysis for Financial Fraud Identification Based on Machine Learning Approaches in Credit Card Systems

Authors

  • Amit Asthana, Postgraduate Student, Department of Computer Science and Engineering, LNCT Group of Colleges, Bhopal, Madhya Pradesh, India
  • Raj Kumar Sharma, Assistant Professor, Department of Computer Science and Engineering, LNCT Group of Colleges, Bhopal, Madhya Pradesh, India

Keywords:

Apache Hadoop, Big data techniques, Credit card fraud, Financial fraud detection, Machine Learning

Abstract

The volume of monetary transactions has grown exponentially, and with it the challenge of identifying credit card fraud. Traditional methods struggle to keep pace with the growing sophistication of fraudulent activity. To address this concern, this study examines how big data platforms such as Hadoop and Spark, combined with machine learning (ML), can enhance the detection and prevention of credit card fraud. It demonstrates how ML algorithms running on these platforms can search massive volumes of transaction data for anomalies and intricate fraud patterns that conventional methods might miss. The study also explores the capacity of several ML methods, including supervised and unsupervised learning, to improve detection accuracy while reducing false positives.

Additionally, the paper examines key challenges, including data privacy concerns, scalability issues, and the need for adaptive models that can evolve with emerging fraud tactics. The results indicate that fraud detection systems can be made considerably more efficient and effective by combining big data analytics with ML. Future work in this domain could explore hybrid models that combine ML with other emerging technologies, such as blockchain, to provide more robust security. Privacy-preserving techniques and real-time fraud detection models also remain important areas for future research, as they hold the potential to further strengthen the security of financial transactions and protect consumers from financial losses.

Published

2025-05-26

Section

Articles