
Benchmark Study on Replication Lag in PostgreSQL using Single Region and Multi-Region Architectures

Contents

Executive Summary
Introduction
  Purpose of the Study
  Overview of Replication in PostgreSQL
Benchmark Environment
  Cluster Setup
  Tools and Versions
  Hardware/OS Specifications
Methodology
  Configuration and Deployment
  Measurement of Replication Lag
  Load Testing Tools
Results
  Single-Region Results
  Multi-Region Results
  Comparative Analysis
Discussion
  Performance Analysis
  Analysis of Tools: pgbench vs. HammerDB
  Real-World Implications
Conclusion
Recommendations


1. Executive Summary
This whitepaper presents the findings from a benchmark study aimed at evaluating replication
lag across different PostgreSQL cluster architectures using Postgres Distributed (PGD) from
EDB. Two primary configurations were tested: a single-region, single-active node architecture
with two standby nodes, and a multi-region active-active architecture, where each active node
had two standby nodes.
The benchmarking tools used were pgbench for simulating workloads in both environments
and HammerDB for load generation in one of the regions in the multi-region setup. The study
results demonstrate that PostgreSQL with PGD can maintain minimal replication lag in both
single-region and multi-region setups, supporting the technology's robustness in distributed
database environments.


2. Introduction

2.1 Purpose of the Study

As modern applications become more distributed, database systems must ensure high
availability and low replication lag to support global-scale workloads. This study investigates the
replication performance of PostgreSQL clusters, comparing a single-region, active-standby
setup against a multi-region, active-active architecture.

The goal is to measure and compare replication lag—the time taken for a committed transaction
on the primary node to be reflected on standby nodes—and to assess the performance under
real-world conditions, using PGD by EDB.

2.2 Overview of Replication in PostgreSQL

Replication in PostgreSQL ensures that data is kept consistent across a primary node and its
standby nodes. PGD extends this functionality to handle more complex, distributed setups, such
as multi-region clusters with active-active architectures. The replication lag—the delay between
a commit on the primary node and its appearance on the standby node—can have significant
implications for performance, availability, and disaster recovery.

This study focuses on measuring replication lag and its impact in both single-region and multi-
region PostgreSQL clusters.


3. Benchmark Environment

3.1 Cluster Setup

Two distinct PostgreSQL cluster configurations were tested using PGD:

Single-Region Setup: One primary node with two standby nodes, all located within a single
region.

Multi-Region Setup: Two geographic regions, each containing a primary node with two
associated standby nodes. The two regions were interconnected, enabling cross-region
replication.


3.2 Tools and Versions

PGD Version: 5.5.0 (build date 2024-05-16T09:42:16Z)
PostgreSQL Version: PostgreSQL 16.4 (EnterpriseDB Advanced Server 16.4.1)
Load Generation Tools:
  pgbench: scale 10 c1 j1 t 18000
  HammerDB: pg_total_iterations = 100k / vuset vu 20
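
The pgbench parameters above are listed without their flag dashes, so the following is only a sketch of how such a run might be assembled. It assumes that "scale 10" maps to `-s 10`, "c1" to `-c 1`, "j1" to `-j 1`, and that "t 18000" is a duration in seconds (`-T 18000`), since 18000 seconds equals the 5-hour single-region run. If a transaction count was meant instead, `-t` would apply. The database name `benchdb` is hypothetical.

```python
import shlex


def pgbench_commands(dbname, scale=10, clients=1, threads=1, duration_s=18000):
    """Build the initialization and run commands for a pgbench test.

    Assumption: the duration flag (-T, seconds) is used; swap in -t if the
    original parameter was a per-client transaction count.
    """
    # -i initializes the pgbench tables, -s sets the scale factor.
    init_cmd = ["pgbench", "-i", "-s", str(scale), dbname]
    # -c is the number of clients, -j the number of worker threads.
    run_cmd = [
        "pgbench",
        "-c", str(clients),
        "-j", str(threads),
        "-T", str(duration_s),
        dbname,
    ]
    return init_cmd, run_cmd


# "benchdb" is a hypothetical database name used only for illustration.
init_cmd, run_cmd = pgbench_commands("benchdb")
print(shlex.join(init_cmd))
print(shlex.join(run_cmd))
```

The commands are only printed here, not executed; passing the lists to `subprocess.run` on a host with pgbench installed would start the actual load.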

3.3 Hardware/OS Specifications

Platform: Amazon Web Services (AWS) EC2
Operating System: Debian GNU/Linux 11 (Bullseye)
Storage Type: gp2 (Amazon EBS General Purpose SSD)
Instance Type: t3.medium


4. Methodology

4.1 Configuration and Deployment

4.1.1 Single-Region Setup

A single primary node and two standby nodes were deployed within the same data center,
resulting in low network latency between the nodes.
Load Generation: pgbench was used to simulate a transactional workload on the primary
node, continuously generating read-write queries to stress the system.

4.1.2 Multi-Region Setup

Two regions were configured, each with a primary node and two standby nodes. The two
regions were geographically distributed to simulate real-world global deployments, with
cross-region replication occurring between primary nodes.
Load Generation:
Region A: pgbench was used to generate a load of transactions on the primary node.
Region B: HammerDB was used to simulate a separate load, providing a contrasting
load pattern compared to pgbench.

4.2 Measurement of Replication Lag

Replication lag was measured using PGD’s built-in command:

pgd show-replslots --verbose

The lag was calculated as the difference between the commit timestamp on the primary
node and the replay timestamp on the standby nodes, with monitoring at regular
intervals to capture fluctuations under load.
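
The calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the tooling used in the study: it does not parse the output of `pgd show-replslots`, the function name `lag_ms` is hypothetical, and the timestamp pairs are made-up values chosen only to show the arithmetic.

```python
from datetime import datetime
from statistics import mean


def lag_ms(commit_ts: str, replay_ts: str) -> float:
    """Return the replay delay in milliseconds between two ISO 8601 timestamps."""
    committed = datetime.fromisoformat(commit_ts)
    replayed = datetime.fromisoformat(replay_ts)
    return (replayed - committed).total_seconds() * 1000.0


# Hypothetical (commit, replay) timestamp pairs taken at regular intervals.
samples = [
    ("2024-09-01T10:00:00.000000+00:00", "2024-09-01T10:00:00.002000+00:00"),
    ("2024-09-01T10:10:00.000000+00:00", "2024-09-01T10:10:00.003000+00:00"),
    ("2024-09-01T10:20:00.000000+00:00", "2024-09-01T10:20:00.009000+00:00"),
]

lags = [lag_ms(commit, replay) for commit, replay in samples]
print(f"average lag: {mean(lags):.2f} ms, peak lag: {max(lags):.2f} ms")
```

Reporting both the average and the peak mirrors how the results are presented in this study, where a low average can coexist with occasional spikes.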


4.3 Load Testing Tools

pgbench: pgbench is a benchmarking tool that generates a configurable workload
consisting of a mix of read and write transactions, simulating a variety of scenarios that
represent typical OLTP (Online Transaction Processing) systems.
HammerDB: HammerDB is a powerful benchmarking tool designed to simulate workloads
on databases using a TPC-C-like transaction model. It provides a good representation of the
demands of an OLTP system with higher variability in transaction types and sizes compared
to pgbench.


5. Results
5.1 Single-Region Results

The cluster was stressed for 5 hours, with replication lag recorded at 30 equally spaced measurement checkpoints.

The resulting graphs from the replication lag measurement for Kaboom-Kaolin and for
Kaboom-Kaftan are as follows:

[Figures: replication lag graphs for Kaboom-Kaolin and Kaboom-Kaftan]


5.1.1 Replication Lag

The average replication lag in the single-region setup was 2.36 milliseconds during normal
load conditions generated by pgbench.
Peak Lag: Occurred during periods of intense load spikes, reaching up to 9.4 milliseconds.

5.1.2 Analysis of Results

The minimal lag observed highlights the efficiency of replication within the same data
center.
Standby nodes in the single-region architecture replayed transactions rapidly, maintaining
near real-time synchronization.


5.2 Multi-Region Results

The cluster was stressed for 1 hour, with replication lag recorded at 30 equally spaced measurement checkpoints.

The resulting graphs from the replication lag measurement for Kaboom-Kaolin and for
Kaboom-Kaftan are as follows:

[Figures: replication lag graphs for Kaboom-Kaolin and Kaboom-Kaftan]


5.2.1 Replication Lag


Cross-region replication lag was notably higher, averaging 29.25 milliseconds, due to the
network latency introduced by geographic distance.
Peak lag occurred during intense load spikes, especially in Region A. The multi-region
system managed to keep replication consistent but with occasional spikes in lag reaching
269 milliseconds.

5.2.2 Factors Influencing Lag


The cluster carried roughly twice the workload, with two benchmarks applying load to the
system in parallel.
Network latency between regions was a significant factor.
Even so, replication lag stayed below one second at its worst.

5.3 Comparative Analysis

In the single-region setup, replication lag was consistently lower due to the close proximity
of nodes and high-speed network connections.
The multi-region architecture, although experiencing higher replication lag, still
demonstrated manageable delays, highlighting PGD's resilience in distributed environments.
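
To put the two sets of figures side by side, the short calculation below derives the multi-region to single-region ratios from the averages and peaks reported in sections 5.1.1 and 5.2.1. The dictionary names are illustrative. Note that the runs are not strictly like-for-like: the single-region test ran for 5 hours under pgbench alone, while the multi-region test ran for 1 hour under two parallel benchmarks.

```python
# Figures reported in this study, in milliseconds.
single_region = {"avg_ms": 2.36, "peak_ms": 9.4}
multi_region = {"avg_ms": 29.25, "peak_ms": 269.0}

# How many times larger the multi-region lag is than the single-region lag.
avg_ratio = multi_region["avg_ms"] / single_region["avg_ms"]
peak_ratio = multi_region["peak_ms"] / single_region["peak_ms"]

# Worst observed multi-region lag expressed as a fraction of one second.
peak_fraction_of_second = multi_region["peak_ms"] / 1000.0

print(f"average lag ratio (multi/single): {avg_ratio:.1f}x")
print(f"peak lag ratio (multi/single): {peak_ratio:.1f}x")
print(f"worst multi-region lag: {peak_fraction_of_second:.3f} s")
```

On these numbers the average lag is roughly twelve times higher across regions and the peak roughly twenty-nine times higher, yet the worst case remains well under one second.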


6. Discussion
6.1 Performance Analysis

The performance results from this study offer a clear understanding of how PGD manages
replication across both single-region and multi-region PostgreSQL clusters under different load
scenarios.

Single-Region Setup: The low replication lag (averaging 2.36 milliseconds) demonstrates
the effectiveness of streaming replication in environments where network latency is
minimal. The transaction commit and replay times were consistent across various pgbench-
generated load intensities, showing that PostgreSQL’s built-in replication mechanism with
PGD handles low-latency environments efficiently.
Multi-Region Setup: As expected, the multi-region setup introduced significantly higher
replication lag (averaging 29.25 milliseconds) due to the cross-region network latencies.
Despite the added complexity, PGD managed the replication across geographic locations
well, keeping replication lag below one second for both workloads tested.

The cross-region spikes observed, especially under HammerDB's workload, revealed that
network latency and the heterogeneity of workloads across regions can cause temporary peaks
in lag. However, PGD's design to support multi-region clusters ensures that replication, while
delayed, remains consistent and without data loss, making it suitable for global-scale
applications where availability and disaster recovery are prioritized.


6.2 Analysis of Tools: pgbench vs. HammerDB

pgbench and HammerDB were chosen for this study because they represent different kinds of
database workloads. Their characteristics and how they impacted replication lag are critical to
understanding the benchmarks.

6.2.1 pgbench
Type of Workload: pgbench simulates a TPC-B-like OLTP (Online Transaction Processing)
workload, which consists of short transactions that are mostly read-write in nature.
Load Characteristics: pgbench workloads tend to be predictable, with a standard set of
queries that stress the system uniformly. This creates a steady, consistent load, ideal for
assessing the impact of continuous, regular transactions on replication.
Impact on Replication Lag: pgbench’s short and uniform transaction patterns in the single-
region setup caused minimal fluctuations in replication lag. In the multi-region setup, the
replication lag observed was still within acceptable bounds, since the load wasn’t bursty or
highly variable.
Best Use Case: pgbench is ideal for simulating real-time OLTP systems where the focus is
on short, frequent transactions, such as retail applications, banking systems, or reservation
systems.

6.2.2 HammerDB
Type of Workload: HammerDB was configured to generate TPC-C workloads, a benchmark
that models a complex transactional environment involving multiple types of transactions,
including heavier, more resource-intensive operations such as warehouse management and
order processing.
Load Characteristics: Unlike pgbench, HammerDB generates a more complex, less
predictable load, simulating more varied real-world workloads. TPC-C workloads include a
mix of short, simple transactions along with more complex queries that involve several
tables and result in higher transaction execution times.
Impact on Replication Lag: HammerDB’s variability resulted in occasional spikes in
replication lag, particularly in the multi-region setup. The TPC-C workload introduced a more
uneven load, stressing the network and replication mechanisms more heavily than pgbench.
The maximum replication lag observed during HammerDB's high-load periods was 269
milliseconds, highlighting the challenges of handling more complex workloads in a multi-
region setup.


Best Use Case: HammerDB’s TPC-C simulation is suited for systems requiring complex
transactional workloads with multiple query types, such as e-commerce platforms or
enterprise resource planning (ERP) systems, where transactions are often varied and
resource-intensive.

6.3 Replication Strategy with PGD

6.3.1 Synchronous vs. Asynchronous Replication


Synchronous replication suits setups that must ensure no transaction is confirmed to
clients until it is committed on both the primary and at least one standby node. This approach
ensures data consistency, at the cost of added commit latency while the primary waits. In
contrast, asynchronous replication optimizes performance: it introduces some replication lag
but allows transactions to commit faster on the primary nodes without waiting for replication
to complete on geographically distant nodes.
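
The trade-off can be illustrated with a deliberately simple latency model. This is a toy sketch, not a description of how PGD is configured or how it behaves internally: it assumes a synchronous commit costs the local commit time plus one network round trip to the acknowledging node, and it ignores apply time, batching, and queueing. The function name and all numeric values are hypothetical.

```python
def commit_latency_ms(local_commit_ms: float, replica_rtt_ms: float, synchronous: bool) -> float:
    """Client-visible commit latency under a simplified model.

    Synchronous: the client waits for the local commit and one round trip
    to the acknowledging node. Asynchronous: the client waits for the
    local commit only, and replication catches up afterwards.
    """
    if synchronous:
        return local_commit_ms + replica_rtt_ms
    return local_commit_ms


# Hypothetical values: 1 ms local commit, 60 ms cross-region round trip.
sync_latency = commit_latency_ms(1.0, 60.0, synchronous=True)
async_latency = commit_latency_ms(1.0, 60.0, synchronous=False)

print(f"synchronous commit:  {sync_latency:.1f} ms")
print(f"asynchronous commit: {async_latency:.1f} ms")
```

Under this model the network round trip shows up in client-visible commit latency when replication is synchronous, and as replication lag on the remote node when it is asynchronous.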

6.3.2 Replication Lag and Scalability


The ability to keep replication lag within acceptable limits while handling different load patterns
(pgbench and HammerDB) demonstrates that PGD provides a scalable solution for both single-
region, high-availability deployments and multi-region, disaster-tolerant architectures.

6.4 Real-World Implications

6.4.1 Single-Region Architecture


The results suggest that for businesses requiring low-latency, high-availability environments
(e.g., financial services or high-frequency trading), a single-region architecture with PGD is an
optimal solution. The minimal replication lag ensures near real-time data consistency.

6.4.2 Multi-Region Architecture


For global applications (e.g., multinational e-commerce platforms, logistics companies, or any
service requiring geographic redundancy), the multi-region setup, while introducing some
replication lag, provides the necessary robustness for disaster recovery and availability. The lag
observed in this setup is manageable and acceptable for scenarios where some latency can be
tolerated in favor of geographic redundancy.


7. Conclusion
This benchmark study demonstrates the versatility of PGD from EDB in maintaining replication
integrity across both single-region and multi-region PostgreSQL clusters. The results show that
while single-region architectures benefit from minimal replication lag (averaging 2.36 ms), multi-
region setups naturally introduce higher latencies (averaging 29.25 ms), especially when
workloads are heterogeneous, as seen with the use of pgbench and HammerDB.

Despite the differences in performance between the two architectures, PGD ensures reliable
replication across nodes, making it an excellent solution for a wide range of use cases, from
local high-performance systems to globally distributed databases requiring high availability and
disaster recovery capabilities.

Key findings:

Single-Region Setup: Ideal for low-latency, high-throughput environments where geographic
proximity minimizes replication lag.
Multi-Region Setup: Offers robust disaster recovery and global availability but introduces
network-related replication lag, especially under complex workloads like those generated by
HammerDB.

Ultimately, PGD’s ability to maintain data consistency and minimize replication lag, even in
geographically distributed systems, highlights its value in modern, distributed database
architectures.


8. Recommendations
Based on the results of this benchmark, we recommend the following:

8.1 Single-Region Deployments

For businesses requiring minimal replication lag and high transaction throughput, a single-region
architecture using PGD is optimal. The observed low lag provides near real-time data consistency,
making it suitable for applications where immediate failover and fast data replication are critical.

8.2 Multi-Region Deployments

For global applications that prioritize disaster recovery and high availability across regions, the
multi-region architecture with PGD offers a robust solution. While replication lag is higher due to
network delays, it remains within an acceptable range for most global-scale applications,
especially those that can tolerate asynchronous replication delays.

8.3 Workload Considerations

Understanding the workload profile is crucial. If your application relies on simple OLTP
transactions, this study suggests that pgbench-like workloads cause little lag even in
multi-region setups. However, if your application involves complex transactions and
resource-heavy queries, as simulated by HammerDB, expect some lag spikes in multi-region
setups. Fine-tuning PGD’s replication configuration, network optimizations, and careful
consideration of workload distribution across regions can help mitigate these spikes.
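
Before tuning, it helps to know when spikes occur. The sketch below flags lag samples that exceed a chosen threshold. It is one possible approach rather than a PGD feature: the function name, the 100 ms threshold, and the sample values are all hypothetical, with 269 ms included only because it matches the peak reported in this study.

```python
def find_spikes(lags_ms, threshold_ms):
    """Return (sample_index, lag_ms) pairs for samples above threshold_ms."""
    return [(i, lag) for i, lag in enumerate(lags_ms) if lag > threshold_ms]


# Hypothetical multi-region lag samples in milliseconds.
observed = [21.0, 30.5, 27.9, 269.0, 33.2]

# Hypothetical alerting threshold; choose one that fits the application.
spikes = find_spikes(observed, threshold_ms=100.0)

for index, lag in spikes:
    print(f"sample {index}: lag {lag:.1f} ms exceeds threshold")
```

Correlating the flagged samples with the workload running at that time is one way to judge whether redistributing load across regions is worth trying.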


Why Choose Stormatics?


Stormatics is a specialized consulting firm dedicated to helping businesses scale
PostgreSQL reliably for mission-critical data.

As your business grows, your database's performance, availability, and compliance can
start to lag. These bottlenecks can lead to costly downtime, frustrated users, and
missed opportunities. Left unchecked, these issues can threaten your entire
application stack and damage customer trust. While many providers push generic
solutions, we tailor our services to your unique needs for PostgreSQL.

At Stormatics, we bring specialized expertise directly to where it’s needed most,
offering targeted solutions without forcing you onto unfamiliar platforms. Our
approach ensures you get the precise help you need exactly when you need it, so your
database never holds your business back.

Reliable Solutions
You want your database to be reliable, and we have the expertise to make sure you can
depend on PostgreSQL at scale.

Customer Satisfaction
Our customers are our biggest asset, and our focus is on excellence in service
delivery. We will not rest till you are 100% satisfied.

Customized Services
Your challenges are unique, and so are our services. Our team collaborates with
yours to deliver customized solutions.

We are your trusted PostgreSQL experts

Book a call with us today

stormatics.tech
