1.1. Features
pg_probackup3 includes all the key functionality of earlier versions of the pg_probackup utility. Some less popular features may be missing at the moment but will be implemented in the future.
Compared to pg_probackup, pg_probackup3 offers the following new features and improvements:
Version independence: The same pg_probackup3 version can now be used with different versions of Postgres Pro or PostgreSQL, ensuring compatibility and flexibility.
API integration: pg_probackup3 can be integrated with various backup systems via API, thus offering centralized management of the backup process.
Work without SSH: pg_probackup3 can work without an SSH connection, enabling more effective and secure data transfer.
FUSE: pg_probackup3 introduces the new fuse command, which enables running a database instance directly from a backup without requiring a full restore, using the FUSE (Filesystem in Userspace) mechanism.
Operation by unprivileged users: pg_probackup3 can be started by users who do not have access rights to PGDATA. This helps to increase security and reduce the risk of potential errors.
A new backup format: Each backup is now stored as a single file, making it easier to manage and store backups.
pg_basebackup support: In the BASE data source mode, it is now possible to leverage the pg_basebackup replication protocol for improved backup speed and efficiency.
PRO mode: pg_probackup3 introduces a proprietary replication protocol in the new PRO data source mode.
Merging incremental backup chains: It is now possible to save disk space by merging chains of incremental backups.
Completely reengineered core
Redesigned architecture
Improved performance
As compared to other backup solutions, pg_probackup3 offers the following benefits that can help you implement different backup strategies and deal with large amounts of data:
S3 support for storing data in private clouds using MinIO object storage, Amazon S3 storage, and VK Cloud storage: Available when using pg_probackup3 with Postgres Pro Enterprise. Backup data is transferred to and from S3 without being saved in intermediate locations, which eliminates the need for large temporary storage.
Tape ready: pg_probackup3 supports working with tape storage backup systems.
NFS v3 and v4 support: pg_probackup3 allows storing backups in the network file system.
Incremental backup: With two incremental modes, DELTA and PTRACK, you can plan the backup strategy in accordance with your data flow. Incremental backups allow you to save disk space and speed up backups compared to taking full backups. It is also faster to restore the cluster by applying incremental backups than by replaying WAL files.
Retention: Managing the WAL archive and backups in accordance with a retention policy. You can configure the retention policy based on recovery time or the number of backups to keep, as well as specify the time to live (TTL) for a particular backup. Expired backups can be merged or deleted.
Parallelization: Running backup, restore, merge, delete, and validate processes on multiple parallel threads.
Remote operations: Backing up a Postgres Pro instance located on a remote system or restoring a backup remotely.
External directories: Backing up files and directories located outside of the Postgres Pro data directory (PGDATA), such as scripts, configuration files, logs, or SQL dump files.
Backup catalog: Getting the list of backups and the corresponding meta information in plain text or JSON formats.
Archive catalog: Getting the list of all WAL timelines and the corresponding meta information in plain text or JSON formats.
Partial restore: Restoring only the specified databases.
Integration with other applications enabled by the API provided by the libpgprobackup library.
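The Retention item in the list above names three criteria: a recovery window, a number of backups to keep, and a per-backup TTL. The following Python sketch shows one way such criteria could be evaluated together. It is a toy model, not pg_probackup3 code: the Backup class, the expired_backups function, and the keep_last and window parameters are all hypothetical names, and the rule that a backup survives if any criterion covers it is an assumption. Dependencies between incremental backups and their parents are ignored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Backup:
    backup_id: str                         # hypothetical identifier
    finished_at: datetime                  # when the backup completed
    expires_at: Optional[datetime] = None  # per-backup TTL, if one was set


def expired_backups(backups: List[Backup], now: datetime,
                    keep_last: Optional[int] = None,
                    window: Optional[timedelta] = None) -> List[str]:
    """Return IDs of backups that no rule in this toy policy protects."""
    if keep_last is None and window is None:
        return []                          # no policy configured: keep everything
    newest_first = sorted(backups, key=lambda b: b.finished_at, reverse=True)
    protected = set()
    for rank, b in enumerate(newest_first):
        if keep_last is not None and rank < keep_last:
            protected.add(b.backup_id)     # among the N most recent backups
        if window is not None and now - b.finished_at <= window:
            protected.add(b.backup_id)     # still inside the recovery window
        if b.expires_at is not None and b.expires_at > now:
            protected.add(b.backup_id)     # TTL has not run out yet
    return [b.backup_id for b in newest_first if b.backup_id not in protected]


if __name__ == "__main__":
    now = datetime(2024, 6, 30)
    catalog = [
        Backup("A", datetime(2024, 6, 1)),
        Backup("B", datetime(2024, 6, 10), expires_at=datetime(2024, 12, 31)),
        Backup("C", datetime(2024, 6, 20)),
        Backup("D", datetime(2024, 6, 29)),
    ]
    # D is recent, B is pinned by its TTL, so C and A are reported as expired.
    print(expired_backups(catalog, now, keep_last=1, window=timedelta(days=7)))
```

In this model the caller decides what to do with the returned IDs, which mirrors the statement that expired backups can be either merged or deleted.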
To manage backup data, pg_probackup3 creates a backup catalog. This is a directory that stores all backup files with additional meta information, as well as WAL archives required for point-in-time recovery. You can store backups for different instances in separate subdirectories of a single backup catalog.
Using pg_probackup3, you can take full or incremental backups:
FULL backups contain all the data files required to restore the database cluster.
Incremental backups operate at the page level, only storing the data that has changed since the previous backup. This allows you to save disk space and speed up the backup process compared to taking full backups. It is also faster to restore the cluster by applying incremental backups than by replaying WAL files. pg_probackup3 supports the following modes of incremental backups:
DELTA backup. In this mode, pg_probackup3 reads all data files in the data directory and copies only those pages that have changed since the previous backup. This mode can create read-only I/O load equal to that of a full backup.
PTRACK backup. In this mode, Postgres Pro tracks page changes on the fly. Continuous archiving is not necessary for it to operate. Each time a relation page is updated, this page is marked in a special PTRACK bitmap. Tracking implies some minor overhead on the database server operation, but speeds up incremental backups significantly.
Warning
After promoting a standby server to primary and switching timelines, take the first backup in either FULL or DELTA mode. Using other backup modes in this case may result in data corruption.
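The difference in read load between the two incremental modes can be illustrated with a small Python model. This is a sketch under stated assumptions, not the pg_probackup3 implementation: pages are modeled as dictionary entries, the DELTA-style function detects changes by comparing page contents with the previous state (the actual detection mechanism is not described in this section), and a Python set stands in for the PTRACK bitmap. All function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pages = Dict[int, bytes]  # page number -> page content (toy stand-in for data files)


def write_page(current: Pages, dirty: Set[int], no: int, data: bytes) -> None:
    """Update a page and mark it in the change map at write time."""
    current[no] = data
    dirty.add(no)


def delta_backup(current: Pages, previous: Pages) -> Tuple[Pages, int]:
    """DELTA-style: read every page, keep only the ones that differ."""
    changed = {no: data for no, data in current.items()
               if previous.get(no) != data}
    return changed, len(current)           # pages read: all of them


def ptrack_backup(current: Pages, dirty: Set[int]) -> Tuple[Pages, int]:
    """PTRACK-style: read only the pages flagged in the change map."""
    changed = {no: current[no] for no in sorted(dirty)}
    return changed, len(dirty)             # pages read: flagged ones only


def restore(full: Pages, increments: List[Pages]) -> Pages:
    """Rebuild the latest state by applying increments oldest-first."""
    result = dict(full)
    for inc in increments:
        result.update(inc)
    return result


if __name__ == "__main__":
    full = {0: b"a", 1: b"b", 2: b"c", 3: b"d"}   # state at the FULL backup
    current, dirty = dict(full), set()
    write_page(current, dirty, 2, b"C")

    delta_pages, delta_read = delta_backup(current, full)
    ptrack_pages, ptrack_read = ptrack_backup(current, dirty)
    print(delta_pages == ptrack_pages)            # same increment either way
    print(delta_read, ptrack_read)                # read cost differs
    print(restore(full, [delta_pages]) == current)
```

Both functions produce the same increment, but the DELTA-style scan touches every page while the PTRACK-style lookup touches only the flagged ones, at the price of bookkeeping on every write.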
pg_probackup3 can take only physical online backups, and online backups require WAL for consistent recovery. So regardless of the chosen backup mode (FULL, DELTA, or PTRACK), any backup taken with pg_probackup3 must use one of the following WAL delivery modes:
STREAM. Such backups include all the files required to restore the cluster to a consistent state at the time the backup was taken. Whether or not continuous archiving is set up, the WAL segments required for consistent recovery are streamed via the replication protocol during backup and included in the backup files. That's why such backups are called autonomous, or standalone.
ARCHIVE. Such backups rely on continuous archiving to ensure consistent recovery. This is the default WAL delivery mode.
In pg_probackup3 there are the following modes of backup data sources:
DIRECT. Does not use any replication protocol.
BASE. Uses the pg_basebackup protocol.
PRO. The default mode that uses the pg_probackup3 protocol.
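The three independent choices described in this section (backup mode, WAL delivery mode, and data source) can be summarized in a small Python data model. This is only an illustrative sketch: the class and function names are hypothetical, the defaults are the ones stated above, and the model does not check which combinations pg_probackup3 actually accepts.

```python
from dataclasses import dataclass
from enum import Enum


class BackupMode(Enum):
    FULL = "FULL"
    DELTA = "DELTA"
    PTRACK = "PTRACK"


class WalMode(Enum):
    STREAM = "STREAM"
    ARCHIVE = "ARCHIVE"


class DataSource(Enum):
    DIRECT = "DIRECT"
    BASE = "BASE"
    PRO = "PRO"


@dataclass(frozen=True)
class BackupPlan:
    mode: BackupMode
    wal_mode: WalMode = WalMode.ARCHIVE    # default WAL delivery mode per this section
    source: DataSource = DataSource.PRO    # default data source per this section

    @property
    def is_standalone(self) -> bool:
        """STREAM backups carry their own WAL, so they restore without an archive."""
        return self.wal_mode is WalMode.STREAM


def safe_after_promotion(mode: BackupMode) -> bool:
    """Encode the warning above: first backup after a timeline switch is FULL or DELTA."""
    return mode in (BackupMode.FULL, BackupMode.DELTA)


if __name__ == "__main__":
    plan = BackupPlan(BackupMode.DELTA, wal_mode=WalMode.STREAM)
    print(plan.source.value, plan.is_standalone)
    print(safe_after_promotion(BackupMode.PTRACK))
```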