Benchmark Study On Replication Lag in PostgreSQL Using Single Region and Multi-Region Architectures
Contents
Executive Summary
Introduction
Purpose of the Study
Overview of Replication in PostgreSQL
Benchmark Environment
Cluster Setup
Tools and Versions
Hardware/OS Specifications
Methodology
Configuration and Deployment
Measurement of Replication Lag
Load Testing Tools
Results
Single-Region Results
Multi-Region Results
Comparative Analysis
Discussion
Performance Analysis
stormatics.tech 2
Replication Lag in PostgreSQL using Single Region and Multi-Region Architectures
1. Executive Summary
This whitepaper presents the findings from a benchmark study aimed at evaluating replication
lag across different PostgreSQL cluster architectures using Postgres Distributed (PGD) from
EDB. Two primary configurations were tested: a single-region, single-active node architecture
with two standby nodes, and a multi-region active-active architecture, where each active node
had two standby nodes.
The benchmarking tools used were pgbench for simulating workloads in both environments
and HammerDB for load generation in one of the regions in the multi-region setup. The study
results demonstrate that PostgreSQL with PGD can maintain minimal replication lag in both
single-region and multi-region setups, supporting the technology's robustness in distributed
database environments.
2. Introduction
As modern applications become more distributed, database systems must ensure high
availability and low replication lag to support global-scale workloads. This study investigates the
replication performance of PostgreSQL clusters, comparing a single-region, active-standby
setup against a multi-region, active-active architecture.
The goal is to measure and compare replication lag—the time taken for a committed transaction
on the primary node to be reflected on standby nodes—and to assess the performance under
real-world conditions, using PGD by EDB.
Replication in PostgreSQL keeps data consistent between a primary node and its standby nodes.
PGD extends this functionality to more complex, distributed setups, such as multi-region clusters
with active-active architectures. Replication lag can have significant implications for
performance, availability, and disaster recovery, so this study measures it, and its impact, in
both single-region and multi-region PostgreSQL clusters.
3. Benchmark Environment
Single-Region Setup: One primary node with two standby nodes, all located within a single
region.
Multi-Region Setup: Two geographic regions, each containing a primary node with two
associated standby nodes. The two regions were interconnected, enabling cross-region
replication.
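The two topologies described above can be sketched as a small data model. This is only an illustration: the class names, region names, and node names are placeholders chosen for the example, not identifiers from the study or from PGD.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """One region: a single primary (active) node plus its standby nodes."""
    name: str
    primary: str
    standbys: list[str] = field(default_factory=list)

    def node_count(self) -> int:
        return 1 + len(self.standbys)


@dataclass
class Cluster:
    regions: list[Region]

    def node_count(self) -> int:
        return sum(r.node_count() for r in self.regions)

    def cross_region_links(self) -> list[tuple[str, str]]:
        """Pairs of primaries that replicate to each other across regions."""
        primaries = [r.primary for r in self.regions]
        return [(a, b) for i, a in enumerate(primaries) for b in primaries[i + 1:]]


# Region and node names are placeholders, not the names used in the study.
single_region = Cluster([Region("region-a", "a-primary", ["a-standby-1", "a-standby-2"])])
multi_region = Cluster([
    Region("region-a", "a-primary", ["a-standby-1", "a-standby-2"]),
    Region("region-b", "b-primary", ["b-standby-1", "b-standby-2"]),
])

print(single_region.node_count())         # 3
print(multi_region.node_count())          # 6
print(multi_region.cross_region_links())  # [('a-primary', 'b-primary')]
```

The single-region cluster has three nodes and no cross-region link, while the multi-region cluster has six nodes and one link between the two primaries.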
4. Methodology
Single-Region Setup: A single primary node and two standby nodes were deployed within the same
data center, resulting in low network latency between the nodes.
Load Generation: pgbench was used to simulate a transactional workload on the primary
node, continuously generating read-write queries to stress the system.
Multi-Region Setup: Two regions were configured, each with a primary node and two standby
nodes. The two regions were geographically distributed to simulate real-world global
deployments, with cross-region replication occurring between primary nodes.
Load Generation:
Region A: pgbench was used to generate a load of transactions on the primary node.
Region B: HammerDB was used to simulate a separate load, providing a load pattern that
contrasts with that of pgbench.
• Create the file repository configuration
The lag was calculated as the difference between the commit timestamp on the primary
node and the replay timestamp on the standby nodes, with monitoring at regular
intervals to capture fluctuations under load.
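The lag calculation described above can be sketched as follows. This is a minimal illustration, not the tooling used in the study: the function name and the sample timestamps are hypothetical, and the calculation assumes the clocks on the primary and standby nodes are synchronized, since it compares timestamps taken on different machines.

```python
from datetime import datetime, timedelta, timezone


def replication_lag_ms(commit_ts: datetime, replay_ts: datetime) -> float:
    """Lag in milliseconds: replay time on the standby minus commit time on the primary."""
    return (replay_ts - commit_ts) / timedelta(milliseconds=1)


# Illustrative timestamps only; these are not measurements from the study.
commit = datetime(2024, 1, 1, 12, 0, 0, 0, tzinfo=timezone.utc)
replay = datetime(2024, 1, 1, 12, 0, 0, 2500, tzinfo=timezone.utc)  # 2500 microseconds later

print(replication_lag_ms(commit, replay))  # 2.5
```

Repeating this calculation at each monitoring interval produces the series of lag samples from which averages and peaks can be derived.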
5. Results
5.1 Single-Region Results
The cluster was stressed for 5 hours, with 30 equally spaced checkpoints.
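As a rough sketch of what this schedule implies, the snippet below spaces 30 checkpoints evenly over a 5-hour run. It assumes "checkpoints" means measurement points and that the last one falls at the end of the run; the source states neither, and the function name is hypothetical.

```python
def checkpoint_offsets_s(duration_s: float, checkpoints: int) -> list[float]:
    """Offsets in seconds from the start of the run, with the last checkpoint at the end."""
    interval = duration_s / checkpoints
    return [interval * i for i in range(1, checkpoints + 1)]


offsets = checkpoint_offsets_s(5 * 3600, 30)
print(offsets[0], offsets[-1])  # 600.0 18000.0
```

Under these assumptions, a 5-hour run yields one checkpoint every 10 minutes.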
The resulting graphs from the replication lag measurement for Kaboom-Kaolin and for Kaboom-
Kaftan are as follows:
Average Lag: The average replication lag in the single-region setup was 2.36 milliseconds under
the normal load generated by pgbench.
Peak Lag: Replication lag reached up to 9.4 milliseconds during intense load spikes.
The minimal lag observed highlights the efficiency of replication within the same data
center.
Standby nodes in the single-region architecture replayed transactions rapidly, maintaining
near real-time synchronization.
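Average and peak figures like those above can be derived from a series of lag samples as in the sketch below. The function name and the sample values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the study.

```python
from statistics import mean


def summarize_lag(samples_ms: list[float]) -> dict[str, float]:
    """Return the average and peak of a list of replication lag samples in milliseconds."""
    if not samples_ms:
        raise ValueError("no lag samples")
    return {"avg_ms": mean(samples_ms), "peak_ms": max(samples_ms)}


# Hypothetical samples for illustration; not data from the study.
samples = [1.0, 2.0, 3.0, 6.0]
summary = summarize_lag(samples)

print(summary)  # {'avg_ms': 3.0, 'peak_ms': 6.0}
```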
5.2 Multi-Region Results
The cluster was stressed for 1 hour, with 30 equally spaced checkpoints.
The resulting graphs from the replication lag measurement for Kaboom-Kaolin and for
Kaboom-Kaftan are as follows:
5.3 Comparative Analysis
In the single-region setup, replication lag was consistently lower due to the close proximity
of nodes and high-speed network connections.
The multi-region architecture, although experiencing higher replication lag, still
demonstrated manageable delays, highlighting PGD's resilience in distributed environments.
6. Discussion
6.1 Performance Analysis
The performance results from this study offer a clear understanding of how PGD manages
replication across both single-region and multi-region PostgreSQL clusters under different load
scenarios.
Single-Region Setup: The low replication lag (averaging 2.36 milliseconds) demonstrates
the effectiveness of streaming replication in environments where network latency is
minimal. The transaction commit and replay times were consistent across various pgbench-
generated load intensities, showing that PostgreSQL’s built-in replication mechanism with
PGD handles low-latency environments efficiently.
Multi-Region Setup: As expected, the multi-region setup introduced significantly higher
replication lag (averaging 29.25 milliseconds) due to cross-region network latency.
Despite the added complexity, PGD managed replication across geographic locations
well, keeping the average lag in the tens of milliseconds.
The cross-region spikes observed, especially under HammerDB's workload, revealed that
network latency and the heterogeneity of workloads across regions can cause temporary peaks
in lag. However, PGD's design to support multi-region clusters ensures that replication, while
delayed, remains consistent and without data loss, making it suitable for global-scale
applications where availability and disaster recovery are prioritized.
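To put the two averages in proportion, the short calculation below uses only the figures reported above (2.36 ms and 29.25 ms). The variable names are illustrative.

```python
single_region_avg_ms = 2.36   # average reported for the single-region setup
multi_region_avg_ms = 29.25   # average reported for the multi-region setup

ratio = multi_region_avg_ms / single_region_avg_ms
added_ms = multi_region_avg_ms - single_region_avg_ms

print(f"{ratio:.1f}x higher, {added_ms:.2f} ms added on average")
# 12.4x higher, 26.89 ms added on average
```

In other words, the multi-region average is roughly twelve times the single-region average, which is about 27 ms of additional delay per replicated commit.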
pgbench and HammerDB were chosen for this study because they represent different kinds of
database workloads. Their characteristics and how they impacted replication lag are critical to
understanding the benchmarks.
6.2.1 pgbench
Type of Workload: pgbench simulates typical OLTP (Online Transaction Processing) TPC-B
type workloads, which consist of short transactions that are mostly read-write in nature.
Load Characteristics: pgbench workloads tend to be predictable, with a standard set of
queries that stress the system uniformly. This creates a steady, consistent load, ideal for
assessing the impact of continuous, regular transactions on replication.
Impact on Replication Lag: pgbench’s short and uniform transaction patterns in the single-
region setup caused minimal fluctuations in replication lag. In the multi-region setup, the
replication lag observed was still within acceptable bounds, since the load wasn’t bursty or
highly variable.
Best Use Case: pgbench is ideal for simulating real-time OLTP systems where the focus is
on short, frequent transactions, such as retail applications, banking systems, or reservation
systems.
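For readers who want to reproduce a comparable pgbench load, the sketch below assembles a command line for pgbench's default TPC-B-like script. The host name, database name, client count, and thread count are assumptions for illustration, since the study does not state the parameters it used. The flags themselves (`-h`, `-c`, `-j`, `-T`, `-P`) are standard pgbench options, and the target database must first be initialized with `pgbench -i`.

```python
import shlex


def build_pgbench_command(host: str, dbname: str, clients: int,
                          threads: int, duration_s: int) -> list[str]:
    """Build a pgbench invocation for the default TPC-B-like script."""
    return [
        "pgbench",
        "-h", host,              # server to connect to (the primary node)
        "-c", str(clients),      # number of concurrent client sessions
        "-j", str(threads),      # number of pgbench worker threads
        "-T", str(duration_s),   # run duration in seconds
        "-P", "60",              # print a progress report every 60 seconds
        dbname,
    ]


# All values here are hypothetical; the study does not report its pgbench settings.
cmd = build_pgbench_command("primary.example.internal", "benchdb",
                            clients=16, threads=4, duration_s=5 * 3600)

print(shlex.join(cmd))
# pgbench -h primary.example.internal -c 16 -j 4 -T 18000 -P 60 benchdb
```

The resulting list can be passed to `subprocess.run` on a machine where pgbench is installed.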
6.2.2 HammerDB
Type of Workload: HammerDB was configured to generate TPC-C workloads, a benchmark
that models a complex transactional environment involving multiple types of transactions,
including heavier, more resource-intensive operations such as warehouse management and
order processing.
Load Characteristics: Unlike pgbench, HammerDB generates a more complex, less
predictable load, simulating more varied real-world workloads. TPC-C workloads include a
mix of short, simple transactions along with more complex queries that involve several
tables and result in higher transaction execution times.
Impact on Replication Lag: HammerDB’s variability resulted in occasional spikes in
replication lag, particularly in the multi-region setup. The TPC-C workload introduced a more
uneven load, stressing the network and replication mechanisms more heavily than pgbench.
The maximum replication lag observed during HammerDB's high-load periods was 269
milliseconds, highlighting the challenges of handling more complex workloads in a multi-
region setup.
Best Use Case: HammerDB’s TPC-C simulation is suited for systems requiring complex
transactional workloads with multiple query types, such as e-commerce platforms or
enterprise resource planning (ERP) systems, where transactions are often varied and
resource-intensive.
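Lag spikes of the kind seen under the HammerDB workload can be flagged in a series of samples with a simple threshold. The sketch below marks any sample that exceeds a multiple of the median. The factor of 5, the function name, and the sample values are assumptions for illustration, not the method or data used in the study.

```python
from statistics import median


def find_lag_spikes(samples_ms: list[float], factor: float = 5.0) -> list[tuple[int, float]]:
    """Return (index, lag) for samples exceeding `factor` times the median lag."""
    if not samples_ms:
        return []
    threshold = factor * median(samples_ms)
    return [(i, v) for i, v in enumerate(samples_ms) if v > threshold]


# Hypothetical samples for illustration; not data from the study.
samples = [20.0, 25.0, 30.0, 200.0, 28.0]

print(find_lag_spikes(samples))  # [(3, 200.0)]
```

The median is used as the baseline rather than the mean so that the spikes themselves do not inflate the threshold.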
7. Conclusion
This benchmark study demonstrates the versatility of PGD from EDB in maintaining replication
integrity across both single-region and multi-region PostgreSQL clusters. The results show that
while single-region architectures benefit from minimal replication lag (averaging 2.36 ms), multi-
region setups naturally introduce higher latencies (averaging 29.25 ms), especially when
workloads are heterogeneous, as seen with the use of pgbench and HammerDB.
Despite the differences in performance between the two architectures, PGD ensures reliable
replication across nodes, making it an excellent solution for a wide range of use cases, from
local high-performance systems to globally distributed databases requiring high availability and
disaster recovery capabilities.
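Key findings:
• Single-region setup: average replication lag of 2.36 ms, with peaks of up to 9.4 ms.
• Multi-region setup: average replication lag of 29.25 ms, with a maximum of 269 ms during
HammerDB's high-load periods.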
Ultimately, PGD’s ability to maintain data consistency and minimize replication lag, even in
geographically distributed systems, highlights its value in modern, distributed database
architectures.
8. Recommendations
Based on the results of this benchmark, we recommend the following:
For businesses requiring minimal replication lag and high transaction throughput, a single-region
architecture using PGD is the better fit. The low lag observed keeps standby nodes nearly in sync
with the primary, making it suitable for applications where immediate failover and fast data
replication are critical.
For global applications that prioritize disaster recovery and high availability across regions, the
multi-region architecture with PGD offers a robust solution. While replication lag is higher due to
network delays, it remains within an acceptable range for most global-scale applications,
especially those that can tolerate asynchronous replication delays.
Understanding the workload profile is crucial. If your application relies on simple OLTP
transactions, similar to the pgbench workload, expect minimal lag even in multi-region setups.
However, if your application involves complex transactions and resource-heavy queries, as
simulated by HammerDB, expect some lag spikes in multi-region setups. Tuning PGD's
replication configuration, optimizing the network, and carefully distributing the workload
across regions can help mitigate these spikes.
As your business grows, your database's performance, availability, and compliance can
start to lag. These bottlenecks can lead to costly downtime, frustrated users, and
missed opportunities. Left unchecked, these issues can threaten your entire
application stack and damage customer trust. While many providers push generic
solutions, we tailor our services to your unique needs for PostgreSQL.
Reliable Solutions
You want your database to be reliable, and we have the expertise to make sure you can
depend on PostgreSQL at scale.
Customer Satisfaction
Our customers are our biggest asset, and our focus is on excellence in service
delivery. We will not rest till you are 100% satisfied.
Customized Services
Your challenges are unique, and so are our services. Our team collaborates with
yours to deliver customized solutions.
stormatics.tech