Toward Sustainable Big Data Analytics: A Review of Privacy-preserving Federated Learning

Kangana Soni; Nitika Singhi

Keywords:

Data privacy protection, Differential privacy techniques, Distributed and decentralized analytics, Federated learning, Privacy-aware big data analytics, Secure data sharing frameworks, Secure model aggregation

Abstract

The rapid growth of data-driven systems in healthcare, banking, automated transport networks, infrastructure, and the Internet of Things has heightened concerns about data privacy, secure data exchange, scalability, and the long-term sustainability of data analytics systems. Traditional centralized big data analytics, which collects raw data at a central server, faces privacy-disclosure risks, regulatory constraints, high communication overhead, and heavy energy consumption, making such systems increasingly inefficient in large-scale, heterogeneous settings. Federated learning (FL) has emerged as a sustainable, privacy-preserving framework: it supports decentralized model training while keeping data on local devices, thereby reducing data transfer and supporting regulatory compliance. This study provides a brief overview of privacy-preserving big data analytics using federated learning, with a focus on data privacy protection approaches, decentralized analytics, and secure data sharing frameworks. Key techniques, including differential privacy, secure aggregation, homomorphic encryption, and secure multi-party computation, are evaluated for their effectiveness in minimizing data leakage from model updates. The review highlights the key trade-offs among privacy protection, communication efficiency, and scalability, and compares federated learning with conventional centralized analytics from a sustainability perspective. Finally, key research gaps are identified: limited real-world deployments, inadequate handling of non-IID data, a lack of standardized evaluation metrics, and poor interoperability with existing big data frameworks. This research aims to support the design of secure, scalable, and sustainable big data analytics architectures.
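To make the idea of secure aggregation concrete, the following is a minimal sketch and not the method of any surveyed work. It assumes a toy setting in which each client's "model" is simply the mean of its local data, and in which pairwise random masks are generated by a single helper for brevity (in a real protocol, each pair of clients would derive its mask from a shared secret). The names `local_update`, `pairwise_masks`, and `federated_round` are hypothetical.

```python
import random


def local_update(data):
    """Hypothetical local training step: the 'model' is the local mean.
    Raw data never leaves this function."""
    return sum(data) / len(data)


def pairwise_masks(num_clients, seed=0):
    """Build one mask per client such that all masks sum to zero.
    Assumption: generated centrally here only to keep the sketch short."""
    rng = random.Random(seed)
    masks = [0.0] * num_clients
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            r = rng.uniform(-100.0, 100.0)
            masks[i] += r  # client i adds the shared value
            masks[j] -= r  # client j subtracts it, so the pair cancels
    return masks


def federated_round(client_datasets, seed=0):
    """One aggregation round: the server sees only masked updates."""
    updates = [local_update(d) for d in client_datasets]
    masks = pairwise_masks(len(updates), seed)
    masked = [u + m for u, m in zip(updates, masks)]
    # The masks cancel in the sum, so the unweighted average is recovered
    # (up to floating-point rounding) without exposing individual updates.
    aggregate = sum(masked) / len(masked)
    return aggregate, masked


clients = [[1.0, 2.0, 3.0], [10.0, 12.0], [5.0]]
aggregate, masked = federated_round(clients)
print(round(aggregate, 6))
```

The sketch averages client updates with equal weight; weighting by local sample count, adding differential-privacy noise, or handling client dropout would each require further steps that are omitted here.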
