Adaptive Replication Strategies for Latency Reduction in Geo-Distributed Databases
Keywords:
Adaptive data replication, Consistency models, Distributed database systems, Fault tolerance, Geo-distributed databases, Latency optimization
Abstract
Geo-distributed databases are fundamental to modern cloud-based and globally distributed applications, enabling data availability and fault tolerance across geographically dispersed regions. However, these systems frequently suffer from high access latency due to physical distance between data centers, fluctuating network conditions, and the overhead introduced by strict consistency guarantees. Traditional static replication strategies, which rely on fixed replica placement and replication factors, are poorly suited to such dynamic environments and often result in inefficient resource utilization and increased response times, particularly for latency-sensitive applications. This study proposes an adaptive replication strategy designed to minimize latency in geo-distributed database systems by dynamically adjusting both replica placement and replication factor in response to changing workload conditions. The approach continuously monitors real-time workload characteristics, including data access frequency, regional demand distribution, and network latency metrics. Predictive models are employed to identify frequently accessed (hot) data items and anticipate future access patterns, enabling proactive replication or relocation of data closer to regions with high demand. By aligning replication decisions with observed and predicted workload behaviour, the proposed strategy seeks to balance performance gains with consistency and resource overhead. The effectiveness of the proposed approach is evaluated through comprehensive simulation and experimental analysis. Key performance metrics, including read and write latency, system throughput, and consistency-related overhead, are measured and compared against conventional static replication schemes. The results demonstrate significant reductions in access latency and improved system responsiveness without incurring excessive consistency costs. 
These findings highlight the potential of adaptive replication as a practical and scalable solution for latency optimization in geo-distributed database deployments, offering valuable insights for the design of performance-aware distributed data management systems.
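To make the monitoring, prediction, and placement loop described in the abstract concrete, the following is a minimal sketch rather than the method evaluated in this study. It assumes an exponentially weighted moving average (EWMA) as a stand-in for the predictive model and a greedy heuristic for replica placement. All names (`AccessPredictor`, `expected_latency`, `choose_replicas`), the parameters `alpha` and `min_gain_ms`, and the shape of the latency matrix are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class AccessPredictor:
    """Assumed stand-in for the predictive model: an EWMA of per-region access counts."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # weight given to the most recent monitoring window
        self.rate = defaultdict(lambda: defaultdict(float))  # item -> region -> rate

    def observe(self, item, counts):
        """Fold one monitoring window (region -> access count) into the estimate."""
        regions = set(self.rate[item]) | set(counts)
        for region in regions:
            old = self.rate[item][region]
            new = counts.get(region, 0)
            self.rate[item][region] = self.alpha * new + (1 - self.alpha) * old

    def predicted_demand(self, item):
        """Return the predicted demand per region for the next window."""
        return dict(self.rate[item])


def expected_latency(demand, replicas, latency):
    """Demand-weighted mean latency when each region reads from its nearest replica."""
    if not replicas:
        return float("inf")
    total = sum(demand.values())
    if total == 0:
        return 0.0
    weighted = sum(
        load * min(latency[region][replica] for replica in replicas)
        for region, load in demand.items()
    )
    return weighted / total


def choose_replicas(demand, latency, max_replicas, min_gain_ms=1.0):
    """Greedy placement (an assumption, not an optimal solver).

    Adds the region that lowers expected latency the most, and stops when the
    replication factor reaches max_replicas or the gain drops below min_gain_ms.
    """
    replicas = []
    current = float("inf")
    while len(replicas) < max_replicas:
        best_region, best_cost = None, current
        for region in latency:
            if region in replicas:
                continue
            cost = expected_latency(demand, replicas + [region], latency)
            if cost < best_cost:
                best_region, best_cost = region, cost
        if best_region is None:
            break  # no remaining region improves latency
        if replicas and current - best_cost < min_gain_ms:
            break  # extra replica is not worth its consistency and storage cost
        replicas.append(best_region)
        current = best_cost
    return replicas
```

In this sketch the `min_gain_ms` threshold plays the role of the trade-off between performance gains and consistency or resource overhead: raising it yields a lower replication factor, while lowering it places data closer to more regions. A deployment would replace the EWMA with whatever predictive model is actually used and would account for write propagation cost, which this sketch omits.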