PostgreSQL DBA Guide

This document discusses installing and configuring PostgreSQL on Linux. It covers downloading PostgreSQL packages, initializing a database cluster using initdb, configuring the data directory and superusers, starting and connecting to the cluster, and PostgreSQL configuration files.

2.

installing PostgreSQL on Linux

downloading the package

configuring PostgreSQL post installation

installation and data directory in Linux

create a superuser for PostgreSQL

make sure the services are running

set up environment variables

the steps below are done on Rocky Linux


RHEL package management

downloading the package

On Red Hat-based systems we use RPM (Red Hat Package Manager) packages. RPM
files define the software repositories that allow the installation of software
packages. The process is as follows:

1. Selecting the Operating System (OS) Version: this determines which repository
packages apply to our system.

2. Utilizing the Yum Repository Method: we opt for the Yum repository method, which
gives us access to a comprehensive software library.

3. PostgreSQL Version Selection: we must specify which PostgreSQL
version to install. For our purposes, let's opt for PostgreSQL version 12.
use the wget command below to download the file directly from the website
wget [link to download]
wget https://yum.postgresql.org/packages/#pg12
or use the script below, which you can get from the website itself

sudo dnf install -y https://download.postgresql.org/pub/repos/yum/reporpms/EL-9-x86_64/pgdg-redhat-repo-latest.noarch.rpm
sudo dnf -qy module disable postgresql
sudo dnf install -y postgresql12-server
sudo /usr/pgsql-12/bin/postgresql-12-setup initdb
sudo systemctl enable postgresql-12
sudo systemctl start postgresql-12

the first command in the script downloads the RPM repository package


disable the PostgreSQL module available on Red Hat (Red Hat only)
sudo dnf -qy module disable postgresql
This command ensures that the built-in PostgreSQL module will not be active or installed on the system. It is useful when you want to prevent the installation or use of a specific PostgreSQL module or version.
/usr/pgsql-12/bin/postgresql-12-setup : this part of the command specifies the path to the
PostgreSQL version 12 setup utility. It is a script responsible for configuring and
managing various aspects of PostgreSQL, including initializing the database cluster.
initdb : the "initdb" command is short for "initialize database." When executed, it creates a
new PostgreSQL database cluster, which is the core directory structure and configuration files
required to run a PostgreSQL instance. It sets up the essential components and directory structure
for a fresh database installation.
In summary, the command "sudo /usr/pgsql-12/bin/postgresql-12-setup initdb" is used to
initialize a PostgreSQL version 12 database cluster on the system with
elevated privileges, creating the necessary infrastructure for PostgreSQL to operate effectively.

an additional package we need to install is postgresql12-contrib, which includes all additional features
to locate this package use yum list postgresql12*
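A minimal sketch of locating and installing the contrib package (assuming the PGDG repository set up above; the package name for version 12 is postgresql12-contrib):

yum list postgresql12*
sudo dnf install -y postgresql12-contrib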
configuring PostgreSQL post installation

During the installation of PostgreSQL on a Linux system, you may have noticed that the installation
process lacks certain user interactions that are typically associated with software installations. Unlike
some other software, PostgreSQL installation on Linux doesn't ask you for important details such as the
data directory, installation directory, port number, or even the PostgreSQL password.

installation and data directory in Linux

the installation directory will be /usr/pgsql-12/bin/

the data directory will be located in /var/lib/pgsql/

create a superuser for PostgreSQL

to check whether the superuser for PostgreSQL is configured


type su - postgres
if it didn't ask you for a password, it means a password was not set up for the postgres user

you cannot log in to the postgres account as a normal user; only root can switch to it without a password
then, once logged in as the root user, just change the password using the passwd command

make sure the services are running

make sure the services are enabled and running using the commands below

sudo systemctl enable postgresql-12
sudo systemctl start postgresql-12

set up environment variables


first log in using the postgres user

then use vim to edit the user's profile file (typically ~/.bash_profile)

the installer has already specified the PGDATA path for the user

you just need to specify the path of the binaries to be used by the user
PATH=$PATH:$HOME/bin
export PATH
export PATH=/usr/pgsql-12/bin:$PATH
when you want to use a different user, you will have to edit the environment path for that user too.
3. database cluster

what is database cluster

initdb

data directory path

initdb syntax

starting the cluster

how to connect to the cluster

how to check the status of cluster

shutdown option

syntax

reload vs restart

reload & restart syntax

pg_controldata

syntax
see data directory


what is a database cluster

a database cluster is a collection of databases managed by a single instance


don't confuse this with a cluster used for high availability

initdb

initdb is a utility used to create a database cluster


initdb creates a setup directory called the data directory, where we store our data.

initdb needs to be installed, and once installed you have to initialize the storage area

data directory path

typically the location of the data directory in Linux will be

/var/lib/pgsql/data/

initdb syntax
you have to execute initdb using the postgres user
syntax:
initdb -D /usr/var/lib/pgsql/main

pg_ctl -D /usr/var/lib/pgsql/main

-D this option refers to the data directory


-W we can use this option to force the superuser to provide a password before initializing the DB
before you start, make sure to add the PostgreSQL command binaries to the PATH
export PATH=/usr/lib/postgresql/13/bin/:$PATH

to test this we will create a directory in the root folder called /postgresql/data/

and we will assign the permissions

chmod 770 /postgresql/data/

then we will run the command

if you get this error it means initdb is not added to the OS environment

usually the binaries are located at /usr/lib/postgresql/13/bin/ (check with ls)

to run the command you will have to give the full path of the initdb binary
followed by the option and the directory
make sure the command is not executed using the root user
if you run into the issue below, you also have to change the owner of the folder using chown
chown postgres:postgres /postgresql/data/

then try to run the command again and it should execute correctly
/usr/lib/postgresql/13/bin/initdb -D /postgresql/data/

now go to the directory you created and you will see the files created in the new directory

starting the cluster

once we have created the cluster it won't start automatically; we need to start it using the command below
same issue as with the initdb command: you have to give the full path of the command or edit the environment
to include the pg_ctl path
pg_ctl start
/usr/lib/postgresql/13/bin/pg_ctl -D /postgresql/data/ start
you may run into an error as shown below, because the cluster is using the same port 5432 as the default cluster

you need to change it to an unused one


edit postgresql.conf using nano, change the port, and enable listening on localhost
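For example, the relevant lines in postgresql.conf might look like this (5444 is an arbitrary free port, matching the connection example below):

listen_addresses = 'localhost'
port = 5444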

now if you run the command again it will execute successfully

how to connect to the cluster

to connect to the new cluster we have created, use the syntax below, and make sure to provide the
correct port for the cluster
psql -U postgres -p 5444   (replace postgres with any user you want, and 5444 with the cluster's port number)

use \q to quit psql

how to check the status of the cluster

you can check if the cluster is running by using the command below
/usr/lib/postgresql/13/bin/pg_ctl status -D /postgresql/data/

shutdown options

the commands below relate to shutting down the cluster


1- smart : the server disallows new connections, but existing sessions keep working normally; it shuts down
only after all sessions have terminated.
2- fast (default) : the server disallows new connections, aborts current transactions, and exits
gracefully.
3- immediate : quits/aborts without a proper shutdown, which leads to recovery on the next startup

syntax

for this purpose I will use the smart option to shut down the new cluster we created; the rest of the
shutdown options work basically the same way.

pg_ctl stop -m smart -D [data directory]   (you can also use immediate or fast)
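A concrete example against the cluster created above (a sketch; adjust the binary path and data directory to your setup):

/usr/lib/postgresql/13/bin/pg_ctl stop -m smart -D /postgresql/data/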

reload vs restart
reload is used when we make changes to configuration files; reload only loads the new config without
restarting the services.
Some changes in the config won't take effect unless we restart the services
restart : gracefully shuts down all activity, closes all open files, and starts again with the new configuration.

reload & restart syntax

reload : systemctl reload postgresql-11


restart : systemctl restart postgresql

pg_controldata

it is used to provide information about the cluster


such as the version number and the last checkpoint

syntax
pg_controldata /var/lib/pgsql/main

see the data directory

to see the data directory, connect with the psql client and use the command below

show data_directory;
4- PostgreSQL Directory layout

installation directory layout

bin folder

data

log folder

installation directory layout

PostgreSQL is typically installed to /usr/lib/postgresql/


the directory structure is the same on both Windows and Linux.
now we will discuss the contents of the installation

bin folder

here you will find all PostgreSQL utility files such as initdb & pg_ctl

data

this folder holds the default cluster, and its location may change depending on whether multiple clusters were created,
but by default it is located in /var/lib/postgresql/13/main

here you will find all the data related to the databases and the configuration files for the cluster

log folder

the PostgreSQL log is very important; it can help you troubleshoot the status of PostgreSQL

the logs are typically stored in the path below


/var/log/postgresql/
5- configuration files

postgresql.conf

check parameters of PostgreSQL inside the psql console

view parameters using pg_settings

view history of changes using pg_file_settings

postgresql.auto.conf

example 1: editing the work_mem parameter

check if changes require a restart from the psql CLI

do parameter changes reflect in the postgresql.conf file?

reset all

pg_ident.conf
pg_hba.conf

postgresql.conf

this file contains parameters that help configure and manage the performance of the database server
when we run the initdb command to create a cluster, it creates the postgresql.conf file by default

the file allows one parameter per line

parameters that require a restart are clearly marked in the file


many parameters need a server restart to take effect

the file is located in the data directory of the cluster

by default it is located in /etc/postgresql/13/main/

check parameters of PostgreSQL inside the psql console
the parameters you saw in the file can also be viewed in the psql console itself
to check and verify whether they are enabled, and what value is set for them
for example, the file has a parameter called max_connections; I can see its value inside the psql
console by using show max_connections;
view parameters using pg_settings

pg_settings is a view containing all the information about the configuration in postgresql.conf
you can query it to get the parameters that are set
for example, the query below will show a parameter that was recently edited and is pending a restart to apply

select name, source, boot_val, pending_restart from pg_settings where name = 'max_connections';

to get the list of columns in pg_settings we can use the option below, useful in case you want to build
a query yourself

\d pg_settings
select name, setting, category, boot_val from pg_settings where sourcefile =
'/etc/postgresql/13/main/postgresql.conf';

view history of changes using pg_file_settings

this view shows the settings currently read from the configuration files, including any changes made to the postgresql.conf file

postgresql.auto.conf

this file holds parameters that were modified from the CLI rather than by editing
postgresql.conf itself

editing postgresql.conf directly is risky, and it is better to change settings from the psql command line using the alter system
command
keeping in mind that some parameters require a restart to take effect in PostgreSQL

example 1: editing the work_mem parameter

we will edit this parameter from the psql CLI

to view the current value use the show command with the parameter
syntax

show [parameter in postgresql.conf];


show work_mem;
to alter it we use alter system set
syntax:
alter system set [parameter] = '[new value]';
alter system set work_mem = '10MB';

check if changes require a restart from the psql CLI


as mentioned before, some changes require a restart; for instance, work_mem doesn't require a restart.
to check if a change requires a restart, go to the postgresql.conf file; there you will find the comment 'restart
required'

or from the CLI itself you can check it by using the query below.

select * from pg_file_settings;


the applied column showing 't' means the change is applied
there is a column called error: if it is empty, no restart is required; if there are entries in the column,
a restart is required.

do parameter changes reflect in the postgresql.conf file?

we have changed the work_mem parameter to 10MB; let's check if it is reflected in the postgresql.conf
file
the value hasn't changed. why?
actually the change is not written to the postgresql.conf file; it will be available in postgresql.auto.conf
if you check it you will find the change you made for work_mem
the file is located in the data directory of the db cluster
/var/lib/postgresql/13/main/
if you check the content of the file you will find the new value for work_mem

when the postgresql service starts it will load the config from postgresql.conf

then it will load the config in postgresql.auto.conf

postgresql will use the value in postgresql.auto.conf and ignore the value in postgresql.conf
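If you open the file, the contents will look roughly like this (a sketch; the two header comments are written by PostgreSQL itself):

# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
work_mem = '10MB'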

reset all

if you want to reset all the changes you made in the psql CLI using the alter system
command
you can use the query below

alter system reset all;

this means all values in postgresql.auto.conf will be removed


but keep in mind it will not reset values in postgresql.conf
only changes done using the alter system command will be reset

in short, postgresql.auto.conf is useful to edit the config of PostgreSQL without touching the


postgresql.conf file

pg_ident.conf

this file is part of the authentication configuration of PostgreSQL and allows mapping an OS user to a
PostgreSQL user
this file matches a user in the PostgreSQL database with a username at the OS level

any changes to the file require a reload only


the file is located in /etc/postgresql/13/main/

I have a PostgreSQL user called ahmed, but the user is not there in the OS, so I will create it and map it to
the user called postgres in PostgreSQL

in the file you will be asked to give a friendly name for the map, then add the OS username in the identity column, then
add the PostgreSQL username in the postgresql_username column
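A sample mapping line under the scenario above (the map name mymap is an arbitrary choice):

# MAPNAME    SYSTEM-USERNAME    PG-USERNAME
mymap        ahmed              postgres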
after that you need to reload PostgreSQL

pg_hba.conf

enables client authentication between the PostgreSQL server and the client application

HBA means host-based authentication.
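An illustrative pg_hba.conf entry (the address range and auth method here are assumptions, not from the original):

# TYPE   DATABASE   USER   ADDRESS        METHOD
host     all        all    10.0.0.0/24    md5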


6. create object (database/user/schema)

creating database

createdb utility

create database from psql

checking the OIDs

connect to a database directly

check what database you connected to !

drop database

from psql
dropdb utility

create user

create user using createuser utility

using --interactive

create user from psql

drop user

from command line

from psql console

privilege

cluster level privileges

revoke superuser privilege


grant a role to a user

object level privileges


creating database

in PostgreSQL you can create a database in two ways

1. through the command line using createdb

2. through psql using create database [database_name];

createdb utility
syntax
createdb [database name]

you can use --help to check for more options

to use the command you need to switch to the postgres user


or use the -U option and create the database

1. using the -U option
I will create a database called asus; I am logged in using the root user
so the syntax is below

~$ createdb -U postgres asus

2. using the postgres user


if you switch to the postgres user there is no need to use the -U option; directly issue the command

~$ createdb lenovo
create database from psql

the syntax:

create database [database_name] owner [username];

if you don't specify the owner, then automatically the owner will be the user that issued the command.

checking the OIDs


once you create the database it will get an OID; to check the OID of the database use the query below
select datname, oid from pg_database;

connect to a database directly

you can connect directly to a database by using the syntax below

$ psql -U postgres [database_name]

check what database you are connected to!

you can use the option below and it will display the connection info, including the database, user, and port

\conninfo

drop database

to drop a database, same as create, you can do it from the command line or from psql.

from psql
drop database [database_name];

dropdb utility

this utility allows you to drop a database from the command line

root$ dropdb -U postgres [database_name]


postgres$ dropdb [database_name]

create user
while creating a user it is important to note that the username must be unique
the username should not start with 'pg_'
the user postgres is created automatically during the installation

it holds the 'superuser' privilege, allowing it to create users and grant role privileges
the user postgres has all privileges with grant option.

only a superuser can create users

to create a user you have two ways, similar to creating a database

1. from the command line using the createuser utility

2. from the psql console

create user using the createuser utility

as usual you can use --help to check the different options you can use
for this purpose I will create a user which is not a superuser, and I will specify a password for it

so the syntax is
root@PostgreSQLSTG ~# createuser -U postgres -P -S [username]
postgres@PostgreSQLSTG ~# createuser -P -S [username]

using --interactive

for a more interactive option, the createuser utility lets you specify --interactive; it will ask
you for the different parameters you have to specify, for ease of use

createuser --interactive
create user from psql

syntax

create user [username] login [privilege] password '[enter the password]';

create user john login superuser password '123456789';

drop user

there are two ways: from the command line and from psql

from the command line

we will use the dropuser command

$ dropuser -U postgres [username]


from the psql console

drop user [username];

privilege

a privilege is the right to execute a particular SQL statement, or the right to access another user's objects.

there are two types of privileges


1. cluster level privileges, granted by a superuser.
2. object level privileges, granted by a superuser, the owner of the object, or someone with grant
privilege

cluster level privileges

when you issue \du in psql you will see the users with cluster level privileges

I have a user named ahmed; I want to revoke his superuser privilege

revoke superuser privilege


to revoke the privilege we will use the syntax below

alter user [user_name] with nosuperuser;

to grant him the superuser privilege again


we will use the syntax below

alter user [username] with superuser;

grant a role to a user

for instance, we want to grant a user the createdb privilege

syntax

alter user ahmed createdb;


object level privileges

first I will connect to the database directly

when you create a table, the owner will be the user who created it; other users will not be able to
select from the table, so to solve this you have to grant the select privilege to the user

grant select on [table_name] to [username];
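For example, assuming a table called staff (used later in this guide) and the user ahmed from the previous section:

grant select on staff to ahmed;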


7. table inheritance, table partitioning

table inheritance

updating values in the parent table

copy table

table inheritance

Child tables inherit all columns from the parent (master) table, while also being allowed
to have additional fields unique to each child table. Furthermore, the 'ONLY' keyword can be used to specify
that a query should exclusively target a particular table and not affect any others.

Please note that any updates or deletions made to the parent table will automatically propagate to the
child tables due to the inheritance relationship.

for testing I have created a new database called testinhernttable


then I will create a table called orders.

create table orders(orderno serial, flightname varchar(100), boarding varchar(100), status varchar(100), source varchar(100));

We will create a child table called 'online_booking' and specify its inheritance using the INHERITS syntax:
inherits(parent_table_name)

create table online_booking (price int) inherits(orders);

To check the description of the child table and verify its inheritance from the parent "orders" table, you
can use the following command:

\d online_booking

We will inspect the child table's metadata and confirm whether it inherits from the parent 'orders' table.
you will also notice that the child table inherited all the columns from the parent table

we can use \d+ orders on the parent table; it will give more info on the parent table and also show
whether there are child tables

now we will insert data into the tables

insert into orders(flightname,boarding,status,source)
values('aircanada','xyz','ontime','employees');

insert into online_booking(flightname,boarding,status,source,price)
values('nippon','chn','ontime','website',5000);

insert into online_booking(flightname,boarding,status,source,price)
values('luftansa','chn','ontime','app',3000);

When querying the parent table, it will display values from both the parent and child tables. However,
any columns that were added exclusively by the child table and are not inherited from the parent will not
be included in the query results

updating values in the parent table

in a child table you can update with no special requirement, but updating the parent table requires caution,
due to the fact that any change will reflect on the children

to update only the parent, use only in the syntax

update only orders set status='delayed';

copy table

copy is used to copy the structure of a table along with its data to another table

the way you implement this is by creating a new table using "as table" followed by the existing table name
below is the syntax with data

create table new_table as table existing_table;

below is the syntax without the data

create table new_table as table existing_table with no data;


8. tablespace

default tablespaces

pg_default

pg_global.

creating tablespace.

move table between tablespaces

check the new path of the table

drop tablespace

Temporary Tablespace in PostgreSQL

creating temporary tablespace

PostgreSQL stores data logically in tablespaces and physically in data files


PostgreSQL uses a tablespace to map a logical name to a physical location on disk

tablespaces allow the user to control the disk layout of PostgreSQL


when we create a table or any object, it goes directly to the data directory by default

to avoid load on the disk, we can separate the data being stored so that it is segregated to different parts of
the disk.

default tablespaces

PostgreSQL comes with two out-of-the-box tablespaces, namely pg_default & pg_global.
the default location for tablespaces is the data directory.

pg_default

is the default tablespace for storing all user data.


is also the default tablespace for the template1 & template0 databases

all newly created databases use the pg_default tablespace, unless overridden by the TABLESPACE clause
while creating the database

pg_global

is the default tablespace for storing global data (shared system catalogs).

creating tablespaces
to show the tablespaces that are in the db cluster you can use the command below
select * from pg_tablespace;

to create a new tablespace, first create the directory where the tablespace will point, so the data can be saved there

syntax:

create tablespace [tablespace_name] location '[directory]';

create tablespace table1 location '/tablepace1';

when creating the new directory you will need to add the required permissions

chmod 700 [path]


chown postgres:postgres [path]

now when you want to create any new object such as a table, a database, or even an index, you can specify
the tablespace that will store the object

[query for creating new object] tablespace [tablespace_name];


create database testtablespace1 tablespace table1;

if you check the directory of the tablespace you will find there is a new file created
move a table between tablespaces

first we will check which tablespace the table is assigned to

select * from pg_tables where tablename = 'staff';


to move a table from one tablespace to another we use the syntax below
first I will check which tablespace the target table is assigned to

the table store is not assigned to any tablespace, so if you find the tablespace column empty it means
the table is assigned to the default tablespace
alter table [table-name] set tablespace [tablespace-name];
alter table store set tablespace table1;

check the new path of the table


we have successfully moved the table from the default tablespace to the new tablespace, meaning that the table
data is no longer located in the default data path of PostgreSQL
to check and confirm that, we can use the query below.

select pg_relation_filepath('store');

drop tablespace

note that you cannot drop a tablespace if there are objects in it
the syntax:

drop tablespace [table_space_name];

Temporary Tablespace in PostgreSQL

PostgreSQL creates temporary objects for actions required to complete a query, for example a
sorting query
a temporary tablespace doesn't store permanent data; its contents are removed after you shut down the database.

creating a temporary tablespace

syntax:
create tablespace [tablespace-chosen-name] owner [username] location '[filepath]';

the owner parameter is not required; in case you don't specify the owner, the temp
tablespace owner will be the postgres user.

after creating the temp tablespace there is one change that must be done in the PostgreSQL configuration:
there is a parameter called temp_tablespaces where you have to specify the temp tablespace name.
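A minimal sketch of setting the parameter from psql (temp_ts is a hypothetical tablespace name):

alter system set temp_tablespaces = 'temp_ts';
select pg_reload_conf();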
9. backup and restore

type of backup

1. logical backup

2. physical backup

logical backup

taking logical backup

taking backup of the entire cluster

How to Compress and Split Dump Files

restore logical backup using psql


restore logical backup pg_restore

restore only single table.

physical backup

offline backup

online physical backup

why we need to copy the WAL files to another location


how to set up WAL archiving

test the WAL level archiving

taking a full online backup using the low level API


online backup using pg_basebackup

type of backup

1. logical backup

A logical backup refers to the process of converting the data in a database into a straightforward text
format. This typically involves creating scripts of the database, table, or database cluster, which are then
saved as SQL files.

The scripts for tables and databases are composed of SQL statements. By executing these SQL
statements, the tables and databases can be recreated

2. physical backup
A physical backup entails duplicating the actual files used for the storage and retrieval of the database.
These backups are typically in binary form, which is not human-readable.
Physical backups can be categorized into two types:

Offline Backup: This type of backup is performed when the system is shut down. During this time, a
backup of the data directory is taken.

Online Backup: This is conducted while the system is operational and running, with users actively
connected to it.

logical backup

taking logical backup

To take a logical backup in PostgreSQL, you can use the built-in utility pg_dump. The syntax for using
this utility is as follows:

pg_dump -U [username] -d [database_name] > [path_to_backup_file]/backup_file_name.sql

Replace [username] with your PostgreSQL username, [database_name] with the name of the database
you want to back up, and [path_to_backup_file]/backup_file_name.sql with the path and name of the file
where you want to store the backup.
For example:

pg_dump -U postgres -d dvdrental > /share/dvdrental_backup.sql

In this example, postgres is the username, dvdrental is the database name, and the backup will be
stored in the specified path /share/dvdrental_backup.sql. Make sure that you have the necessary
permissions for the folder where you intend to store the backup. If you don't specify an extension for the
file, it will be automatically saved with the .sql extension, which is the standard for SQL files.

if you check the file content using any preferred text editor, such as less, you will find only SQL
statements in the file.
taking backup of the entire cluster

To perform a logical backup of a PostgreSQL database cluster, you can use the built-in utility
pg_dumpall . This utility is designed to back up all the databases in the cluster, and the user executing
pg_dumpall needs to have full superuser privileges.

The syntax for using pg_dumpall is as follows:

pg_dumpall -U [superuser_username] > [path_to_backup_file]/backup_file_name.sql

Replace [ superuser_username ] with the username of the superuser and


[path_to_backup_file]/backup_file_name.sql with the path and name of the file where you want
to store the backup.
For example:

pg_dumpall -U postgres > /share/dbcluster_backup.sql

In this example, postgres is the superuser, and the backup of the entire database cluster will be
stored at the specified path /share/dbcluster_backup.sql. Remember to ensure that you have the
necessary permissions for the folder where the backup file will be stored. By default, if you don't specify
an extension for the file, it will be saved with the .sql extension.

While running pg_dumpall for a PostgreSQL database cluster backup, it's normal for the utility to
prompt for the password multiple times. This happens because it requests the password for each
database in the cluster.

This approach is suitable for smaller databases due to its straightforward nature. However, it is not
recommended for large databases as it can be time-consuming to generate the backup. For larger
databases, more efficient methods or tools that can handle large volumes of data more effectively might
be preferable.

How to Compress and Split Dump Files

In the case of logical backups, where the dump file can become quite large, there are strategies to
manage the file size and storage requirements:

1.Compression: You can compress the dump file to reduce its size. This is particularly useful when
dealing with large databases, as it can significantly decrease the space needed for storage. Most
compression tools can reduce the file size substantially, making it easier to handle and store.
syntax :

pg_dumpall | gzip > [filepath]/[backupfilename].gz

example:

pg_dumpall | gzip > /share/clusterall_backup.gz


by doing compression you can see the big difference in size between the original dump file and the
compressed dump file.

2- Splitting the File: If you have limited space in your operating system or wish to distribute the storage
of the dump across different partitions, you can split the dump file into smaller parts. This approach
allows you to manage storage more effectively, especially when dealing with constraints in disk space or
organizational policies on data storage.

To split the output of pg_dumpall into smaller files, you can indeed use the split command in
Unix/Linux systems. The -b option allows you to specify the size of each split file. You can specify the
size in kilobytes (k) , megabytes (m) , or gigabytes (g).

For example, if you want to split the dump into 1KB chunks, you would use the 1k option. Similarly, you
can use m for megabytes and g for gigabytes.

The syntax for this operation would be as follows

pg_dumpall | split -b 1k - [filepath]/cluster_backup

pg_dumpall | split -b 1k - /share/cluster_backup


restore logical backup using psql

to test the process I have gone ahead and dropped the database dvdrental, and then I will attempt to restore the
database

we can use psql to restore the database from the backup


before we restore the database we have to create an empty database
syntax:
psql -U postgres -d dvdrental < /share/dvdrental_backup.sql
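Putting it together, a minimal restore sequence under these assumptions (recreate the empty database, then feed it the dump taken earlier):

createdb -U postgres dvdrental
psql -U postgres -d dvdrental < /share/dvdrental_backup.sql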

restore logical backup with pg_restore


pg_restore is used to restore a PostgreSQL database from an archive created by pg_dump in one of the non
plain text formats
meaning we need to create a custom dump with a certain format so that it can be supported by pg_restore.

to create a custom dump use the syntax below

pg_dump -U postgres -d dvdrental -Fc > /share/dvdrental.dump

the file is in binary format, not human-readable.

restore only a single table.

in this case I will drop a table and restore only that table; this is one of the common scenarios you will
encounter in production.
command to restore a table

syntax:

pg_restore -U postgres -d [database-name] -t [table-name] /share/dvdrental.dump

physical backup

offline backup

here the database server must be offline in order to take the backup


this type of backup is useful when we want to make changes to the database directory and we want to be able to
revert back if there is any flaw in our implementation.

it is important to note that when we restore the database, the PostgreSQL server must be shut down
during the restore.

partial restore or single table restore is not possible, because we are backing up the entire data directory, and
when we restore, we are going to restore the entire data directory
to start, I have two db clusters
using the commands below we can stop one of them

pg_lsclusters
pg_ctlcluster 13 ahmed stop

the syntax for taking an offline backup is as follows

tar -cvzf [filename]_data_backup.tar.gz [data directory path]

the backup will be created in the folder you are currently in, so it is better to change your directory to the folder
where you want to store the backup

online physical backup

In this type of backup the system is available and online for users
In the background a continuous backup is taken
In PostgreSQL we use a continuous archiving method to implement online backup.

WAL files: similar to the MS SQL log file; transactions get written to WAL files upon commit before
they are written to the data files.

This is done to ensure that in case of a crash we can recover the system using the WAL files.

Archiver: the archiver's role is to copy the WAL files to another safe location
To achieve online backup we will use continuous archiving of WAL files, or in a better context, continuous
copying of WAL files to a safe location.

why we need to copy the WAL files to another location

In case of a disaster: assume I have a full backup from Sunday and the system crashes Monday at 10:00 AM,
and you need to recover the system up to 10 AM because you don't want to lose data.
In this case you will need the full backup taken on Sunday plus all the archived WAL files.
This method of restore is called point-in-time recovery.

how to set up WAL archiving

1- Enable WAL archiving in the postgresql.conf file


2- Make a base backup, which is our full backup

Log in to postgres and check the archive mode


using the command below
show archive_mode;
You can see the WAL archiver is off, still not configured

To configure WAL archiving, stop the cluster


pg_ctlcluster 13 ahmed stop
Now we need to make changes in the postgresql.conf file
vi /etc/postgresql/13/ahmed/postgresql.conf
Look for the parameter called 'wal_level' and uncomment the line

Also look for the parameter 'archive_mode'; uncomment it and change it from off to on

Then look for the archive_command parameter


Before you do, you need to make a directory where the WAL files will be stored, and ensure the correct
permissions are given
chmod 700 [directory]
chown postgres:postgres [directory]
Now back to archive_command: uncomment the line and inside the quotes add the command below
cp -i %p [the directory you want to store the WAL files]/%f
cp -i %p /share/archive/%f
Once done, start the cluster and check if archiving is enabled
by going to the psql console and checking the status using the command below

show archive_mode;
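Putting the three parameters together, the relevant postgresql.conf lines look like this (a sketch, assuming /share/archive as the archive directory):

wal_level = replica
archive_mode = on
archive_command = 'cp -i %p /share/archive/%f'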

test the WAL level archiving

We can use pg_start_backup, which just informs PostgreSQL that we are going to start a backup
note: this is not a real backup
select pg_start_backup('label');

from the result you can find one WAL file copied


you can use \! [bash command] to run an OS bash command directly from the psql console.

from the result there is one file copied


let's stop the backup by using the command below

select pg_stop_backup();
for now we have archived WAL for a certain database from psql

taking a full online backup using the low level API

we have set up archiving, but you must keep in mind that the archiver requires a full backup to be
available
without that the archiver is totally useless.

a full online backup means a full backup taken while the system is online

there are two ways to take a full online backup


1- low level API
2- pg_basebackup

to take a low level API backup use the syntax below


in the psql console
select pg_start_backup('give label', false, false);
the first false asks PostgreSQL to take its time doing the I/O and not return control
to the user immediately.
the second false informs PostgreSQL to take a non-exclusive backup.
now we will take a tar of the entire data directory.
similar to the offline backup, but this time we will not stop the cluster

tar -cvzf ahmed_clustr_backup.tar /postgres2/data/

now go back to the psql console and stop the backup that we started

select pg_stop_backup(false);

we pass false because this is a non-exclusive backup

online backup using pg_basebackup

a base backup can be used for replication and point-in-time recovery
pg_basebackup doesn't put the database in backup mode, but keeps the database accessible while the
backup is running.
pg_basebackup cannot take a single object, single table, or single database; instead it will only take a full
backup of the entire db cluster.
syntax:

pg_basebackup -D [backup directory location]


pg_basebackup -h localhost -p 5433 -U postgres -D /share/backup/ -Ft -z -P -Xs

-h the host of the server being backed up; in my case it is localhost, but you can back up a
remote server
-Ft is the format you want (tar)
-z will store the backup in gzip format
-P will allow you to view the progress of the backup.
-Xs means you want a backup of the entire database along with all transactions which happen
during the time of the backup
10-Maintenance in PostgreSQL (course closure)

explain plan and query execution cost.

updating planner statistics / analyze

analyze command

real world example

how to enable autovacuum

vacuum freeze

vacuum

what is vacuum
what is vacuum full

how to check table size

how to vacuum a database and remove data fragmentation.

turn off autovacuum

what happens when a table is bloated


what happens when we vacuum the table

what happens when we vacuum full the table

routine reindexing

how to locate fragmented index

how to install pgstattuple extension


how to use pgstattuple

rebuild index

cluster

all databases must have maintenance tasks to keep optimal performance

maintenance tasks are tasks performed regularly


meaning we usually automate these tasks
in Linux we use 'cronjob'
in Windows we use 'task scheduler'.
PostgreSQL provides the following maintenance options.
1. updating planner statistics / analyze
2. vacuum
3. routine reindexing
4. cluster
explain plan and query execution cost.

the explain statement displays the execution plan chosen by the optimizer for


select, update, insert, and delete statements.
explain helps us understand how much time and cost the query would take if we executed it; explain
doesn't execute the query, it only shows the execution plan.
syntax:
explain select * from <tablename>;

example:
explain select * from actor;
the query will display the following
cost here combines the number of pages read and the CPU processing; from these the cost is calculated
and presented
the cost equation is roughly: cost = (number of pages read * seq_page_cost) + (row count * cpu_tuple_cost)
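A worked example using PostgreSQL's default planner settings (seq_page_cost = 1.0, cpu_tuple_cost = 0.01); the page and row counts match the tel_directory example later in this section:

-- sequential scan of a table with 32 pages and 5000 rows:
-- cost = 32 * 1.0 + 5000 * 0.01 = 82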
Cost Analysis:
the cost displays two values: 1. startup cost, 2. total cost
1. Startup Cost: the estimated cost before the first row can be returned.
2. Total Cost: the estimated cost to return all the rows.
Row Details:

Rows: the estimated number of rows the plan node will return.
Data Width:

Row Size: the estimated size (width) of each row's data, in bytes.

the explain cost will change depending on the query; for example, the query below has a where clause
adding filtering to the result

explain select * from actor where actor_id = 2;

you can see the cost is reduced because we are only checking one particular row
updating planner statistics / analyze

the PostgreSQL query planner relies on statistical information about the contents of tables in order to
generate good plans for queries.
If the statistical information about the table, including the number of rows or the number of columns,
remains static and is not updated, the query optimizer may struggle to create effective execution
plans. Keeping these statistics up to date is crucial for the optimizer to generate optimal query plans
that adapt to changes in the data distribution and size.
tables in general get updates, deletes, and new inserts
so what's going on in the table has to be communicated to the optimizer
inaccurate or outdated statistics may lead to the optimizer choosing a poor execution plan, which may lead to
database performance degradation

analyze command
the analyze command collects information about row count, average row size, and row
sampling
we can run the analyze command automatically by enabling the autovacuum daemon (enabled by default), or
run the analyze command manually

real world example

I have created this table

CREATE TABLE tel_directory (
    ID VARCHAR(5),
    Name VARCHAR(50),
    City VARCHAR(50),
    ZipCode INT
);
then I will disable autovacuum
the command is as follows.

alter table [table-name] set (autovacuum_enabled = false);

to confirm the table is not under autovacuum


use the command below

select reloptions from pg_class where relname = 'table_name';

I have uploaded the insert file below to the server; we will use psql to insert this

insert.sql

to insert we will use the command below

psql -U postgres -d [database-name] -f insert.sql


the insert is completed; I have run a command which counts how many rows there are

now we will see what information the optimizer has about the table
for instance, the rows are stored in pages
let's see what the optimizer knows about the number of pages and the number of rows

select relname, reltuples, relpages from pg_class where relname = 'table_name';

the result came as follows:


- reltuples : number of rows; came as 0 for the optimizer, but there are 5000 rows
- relpages : number of pages; came as 0, but that's not possible since there are rows in the table

so let's see the query execution plan for select * from tel_directory;
explain select * from tel_directory;
the execution plan came out wrong: it said it will read 864 rows, while there are 5000 rows as shown by the count
query.
this means that the optimizer doesn't have any idea about the table.
this means that optimizer doesn't have any idea about the table.

now we will run the analyze command; this will tell the optimizer the number of rows and
pages

the command is as follows

analyze [table-name];

now we will repeat the same command


now let's see the statistics that the optimizer knows about the table.

select relname, reltuples, relpages from pg_class where relname = 'table_name';

now the optimizer knows that there are 5000 rows and that 32 pages are used to store the rows

now let's check explain, and we will see the explain plan knows the exact number of rows
how to enable autovacuum

the same command as to disable, only we put true

alter table [table-name] set (autovacuum_enabled = true);

to confirm whether the table is under autovacuum


use the command below

select reloptions from pg_class where relname = 'table_name';

vacuum freeze

is a critical maintenance operation in PostgreSQL. It's specifically designed to mark rows as "frozen,"
which means they are no longer subject to being vacuumed or removed by the system. This process is
essential for ensuring the stability and performance of the database, particularly in scenarios where data
changes frequently

Here's why VACUUM FREEZE is important:

Transaction ID Wraparound Prevention: PostgreSQL uses Transaction IDs (XIDs) to track the age
and status of transactions. Since XIDs are finite and wrap around, there's a risk of a "transaction ID
wraparound" if the XID limit is reached. When this happens, it can lead to data corruption and
downtime. VACUUM FREEZE helps prevent this by recycling old XIDs, making them available for
reuse.
Performance Optimization: As rows become "frozen," they are excluded from regular vacuuming
processes. This reduces the overhead of vacuum operations and helps maintain consistent
database performance over time.

Example:
Imagine you have a PostgreSQL database that's been running for several years, with frequent data
modifications. The XID counter has been steadily increasing. If you don't perform VACUUM FREEZE,
you risk reaching the XID limit, which could result in a catastrophic database failure.
PostgreSQL provides the utility called vacuumdb to vacuum freeze a table or an entire database.
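A minimal sketch of freezing a whole database with vacuumdb (dvdrental is the sample database used elsewhere in this guide):

vacuumdb -U postgres -d dvdrental --freeze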

vacuum

the built-in utility for vacuum is called vacuumdb


syntax for a normal vacuum
vacuumdb

fragmentation is also called bloat in PostgreSQL.

PostgreSQL doesn't UPDATE IN PLACE or DELETE a row directly from the disk.

meaning when you delete records from a table they are not actually deleted; they are just marked as
deleted and kept as old versions

as the old versions keep piling up and become obsolete, this causes fragmentation and
bloating in the table.

so the tables and indexes will look exactly the same size even after you delete a lot of records.
to solve this we use vacuum and vacuum full

what is vacuum

vacuum reclaims storage occupied by dead tuples/rows


there are two types of vacuum:
- vacuum: the utility simply reclaims the space and makes it available for reuse; this frees space
at the table level, but the space will not be returned to the OS. However, you can do more inserts into the table.
vacuum allows reusing space in the table, but the space is not released back to the OS.

during vacuum there will not be an exclusive lock on the table, meaning we can run vacuum during
business hours and it will not affect performance.

frequently updated tables are good candidates for vacuuming.

what is vacuum full

vacuum full rewrites the entire contents of the table into a new disk file with no extra space.

vacuum full releases the space occupied by deleted records back to the OS.

it reduces the size of the table

vacuum full takes a lot more time than vacuum, because it is going to rewrite the entire content of the table
to disk again

vacuum full will place an exclusive lock.

it is recommended not to run vacuum full during business hours


the syntax for vacuumdb is
vacuumdb -f

I have deleted the following amount of rows.

let's see the table size

SELECT pg_size_pretty(pg_total_relation_size('your_table_name'));
or \dt+ [table-name]
you can see the size reduced

how to check table size

it is important to be able to determine the size of a table

the command below will show the size


\dt+ [table-name]

(note: \dT+ [pattern] is a different command; it lists data types that match the pattern, with extra information)

how to vacuum a database and remove data fragmentation.

we have created a table called tel_directory with 9000 rows; this is the current size of the table

\dt+ tel_directory

and this is the count of rows in the table

turn off autovacuum

since autovacuum is enabled by default, we will not be able to observe the size difference when we do vacuum or
vacuum full after deleting rows

to turn off autovacuum use the command below

alter table tel_directory set (autovacuum_enabled = false);


what happens when a table is bloated

now back to our example; we will delete records from the table

delete from tel_directory where state in ('TX','NJ','NY');

now 3002 records have been deleted; let's check the row count and size

the size did not decrease, because the records are not removed from the table; they are marked
as old versions, meaning they still occupy space on disk

now I will insert the records again


I have a file called state.sql that contains all the records we deleted; we will insert them back into the table

state.sql

psql -U postgres -d dvdrental -f state.sql


let's check the size and row count

the size of the table increased, but this is now considered a bloated ('data fragmented') table, containing records
that are live and records that are deleted but not removed, only marked as old.
bloating is where the size of the table increases even though it holds the same amount of records

what happens when we vacuum the table

now I will start over: truncate the table to remove all records, then insert the records from the insert.sql file again

truncate table tel_directory;


select * from tel_directory;
now I will run insert.sql again

I will delete the rows same as before and check the count and the size after the delete

delete from tel_directory where city in ('CA','FL','NJ');


now we will do a normal vacuum and after that check the status; the below is done in the psql console

vacuum (verbose) tel_directory;

the verbose option will show what has been done and the status after the command; it is optional, not needed;
you can use the syntax below if you prefer

vacuum tel_directory;

the size hasn't decreased. why???


the reason is that vacuum has deleted the rows but has reserved the space for future inserts.

now we will insert the rows in state.sql and check if the space changed

the space has not changed.

what happens when we vacuum full the table


same scenario: I will truncate the table, then reinsert the rows and check the space;
after that I will delete the rows and use vacuum full instead of a normal vacuum and check the space.

now I will delete the rows and check the space

now we will do a vacuum full in psql

vacuum (full, verbose) tel_directory;

the size, as shown below, has decreased by a huge margin compared to a normal vacuum
vacuum full has gone ahead and deleted all dead rows and released the space to the OS

routine reindexing

inserts, updates, and deletes over time fragment the index

a fragmented index has pages where the logical order based on key values differs from the physical
ordering inside the data file.

a heavily fragmented index can degrade query performance, because additional I/O is required to
locate the data to which the index points
reindex rebuilds an index using the data stored in the index's table and eliminates empty space between
pages
syntax: reindex index <index-name>;

how to locate a fragmented index

there is an internal view in PostgreSQL that will show the extensions you can install
in order to add more features; we're looking for an extension that shows table and index
statistics

select * from pg_available_extensions;


the extension we are looking for is called pgstattuple

how to install the pgstattuple extension

to install the extension, go to the psql console and type the command below

create extension pgstattuple;

run the command below again to confirm the extension is installed

select * from pg_available_extensions;


how to use pgstattuple

to use the extension, first check how many indexes you have in the database.

\di

now I will check the status of one of the indexes: whether the index is bloated or fragmented,
or working fine

select * from pgstatindex('[index-name]');

note: the extension must be installed on the database whose indexes you want to
check.
the most important thing here is leaf_fragmentation.
if the leaf fragmentation is above 50, this is considered a fragmented index

rebuild index

syntax:

reindex index [index-name];

cluster

cluster instructs PostgreSQL to cluster the table based on the index specified by index_name

when a table is clustered, it is physically reordered based on the index information


cluster will organize the data physically according to a particular index in the data file.
we need to cluster because users will not insert data in a specific order.

cluster is a one-time operation; when the table receives subsequent changes, the changes will not be
clustered, so you will have to run cluster again

an exclusive lock is required, so cluster should be executed during off-peak hours.


cluster lowers disk access and speeds up queries.

syntax for cluster: cluster [table-name] using [index_name];

now to demonstrate cluster, I will create a table and insert rows in random order.

create table ahmed (id numeric, name varchar(10));


I have inserted values but not in order; even when we do a select we can check that the records are not in order

to cluster it I have to create an index arranged based on id

create index cluster on ahmed(id);

now I will cluster the table using the index I have created; after that we will do a simple query and check if
the records are in order after cluster.

cluster ahmed using cluster;

select * from ahmed;


now the rows are sorted based on the index in the data file
11. master-slave replication

prerequisite :

master slave hardware specs


installing PostgreSQL

Configuring Master

Configuring Slave

test master slave configuration

monitoring replication

prerequisite :

1. Ensure internet connectivity is established for the installation of PostgreSQL.

2. Nano or vim is required for editing config files


3. PostgreSQL, version 12, is to be installed on both nodes.

4. Administrative privileges (sudo access) are required for modifying configuration files.
5. Temporarily disable the firewall to facilitate the necessary configurations.

6. Root access is essential for instances where deletion of slave data files is necessary.

master slave hardware specs

we will be implementing master-slave streaming replication using two VMs running Ubuntu 22 Jammy
both nodes will have PostgreSQL 12 installed, and configured with IPs in the same subnet
below are the VM details.
1. postgresqlDB01 : 10.10.10.77
2. postgresqlDB02 : 10.10.10.78

DB01 - 10.10.10.77, PostgreSQL 12, Ubuntu 22 Jammy, Master

DB02 - 10.10.10.78, PostgreSQL 12, Ubuntu 22 Jammy, Slave

installing PostgreSQL

first check if PostgreSQL 12 is available in Ubuntu repositories using the below command
apt-cache madison postgresql-12
If PostgreSQL 12 isn't available in our current repository, the next step involves adding the PostgreSQL
12 repository to our Ubuntu machine. This requires importing the GPG key and then adding the
repository. To accomplish this, we'll use the following command.
Note: Internet access on the VM is a prerequisite for these steps

wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -

next update the Ubuntu repository to add links for PostgreSQL 12

echo "deb http://apt.postgresql.org/pub/repos/apt/ $(lsb_release -cs)-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list

We've successfully updated the repository. Now, we'll proceed with the installation of PostgreSQL 12.
The following command will be used for this purpose

sudo apt update
sudo apt -y install postgresql-12 postgresql-client-12


once the installation is completed , next steps is to verify if the PostgreSQL services are running.

systemctl status postgresql

Configuring Master

Now we will create a database user “replication” who will carry out the task of replication.
switch to the postgres user, which is automatically added to Ubuntu once we install PostgreSQL

su - postgres

if you are unable to switch to postgres user then reset its password using the below command

sudo passwd postgres


then enter the PostgreSQL console using psql

create a database user “replication” who will carry out the task of replication using the command below;
also note down the password and username, you will need them in the slave configuration

CREATE USER replication REPLICATION LOGIN CONNECTION LIMIT 1 ENCRYPTED PASSWORD '123456789';

check if the role is created successfully by using the below command

\du

Next, we'll adjust the maximum number of connections permitted for the replication user. This is done by
executing the following command within the psql client.

ALTER ROLE replication CONNECTION LIMIT -1;

The configuration files for PostgreSQL are typically found in the following directory:
/etc/postgresql/12/main

We need to modify a file named postgresql.conf to configure this server as the master. While both nano
and vim are suitable editors for this task, I will be using nano for this purpose
Note : Make sure to use a user account with sudo privileges or the root account for these steps, as
administrative access is required to edit the postgresql.conf file.
sudo nano /etc/postgresql/12/main/postgresql.conf

Edit the following parameters in the postgresql.conf file. If a line is currently commented out,
uncomment it to activate the option.
you can use Ctrl+W to search and jump to the desired line

listen_addresses = '*'
wal_level = replica
max_wal_senders = 10
wal_keep_segments = 64
once done press Ctrl+X, then Y, then Enter

now the slave server needs authentication for replication. append the following lines
to the /etc/postgresql/12/main/pg_hba.conf file
sudo nano /etc/postgresql/12/main/pg_hba.conf

# Replace 10.10.10.78 with the slave server's private IP
host replication replication 10.10.10.78/24 md5
next steps is to restart postgresql services

systemctl restart postgresql
systemctl status postgresql
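
optionally, you can confirm the new settings took effect with a quick sanity check from psql:

SHOW wal_level;        -- expect: replica
SHOW max_wal_senders;  -- expect: 10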

We have finished all configuration on the master server, and it is now ready for replication

Configuring Slave

for the slave, it is mandatory to first stop the PostgreSQL services

systemctl stop postgresql
systemctl status postgresql


next, edit the config file with the below parameters
sudo nano /etc/postgresql/12/main/postgresql.conf

listen_addresses = 'localhost,10.10.10.78'
wal_level = replica
max_wal_senders = 10
wal_keep_segments = 64
hot_standby = on

Now append following line to /etc/postgresql/12/main/pg_hba.conf file

sudo nano /etc/postgresql/12/main/pg_hba.conf

# Replace 10.10.10.77 with the master server's private IP

host replication replication 10.10.10.77/24 md5
For the next step, we need to remove the slave data directory. This task requires root privileges, so
ensure you switch to the root user before proceeding

cd /var/lib/postgresql/12/main/
sudo rm -rfv *


Now, we'll synchronize the slave database with the master database by executing the following
command. This will transfer all the data from the master to the slave
sudo su postgres
cd /var/lib/postgresql/12/main/
pg_basebackup -h 10.10.10.77 -U replication -p 5432 -D /var/lib/postgresql/12/main/ -Fp -Xs -P -R
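
for reference: -Fp writes the backup in plain (non-tar) format, -Xs streams the WAL while the backup is taken, -P shows progress, and -R writes the replication settings for you (it creates standby.signal and appends primary_conninfo to postgresql.auto.conf in the data directory).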

As mentioned earlier, remember the password set for the replication user we created. In the following
step, you will be prompted to enter the password for this replication user

Once the fetching process is complete, proceed to start the PostgreSQL service

systemctl start postgresql

Congratulations, you have successfully replicated your database! You can verify this by making any
change in the master database and observing that it gets immediately replicated in the slave database

test master slave configuration

first we will create a database called TestDB


then I will create a table and insert random values; then we will check the slave to confirm the data is replicated.
To proceed:

1. Create Database TestDB :

First, create a database named TestDB . Use the following SQL command:

CREATE DATABASE TestDB;

2. Create a Table and Insert Random Values:

After creating TestDB , switch to this database:

\c TestDB

Then, create a table. Let's say you create a table named example_table :
CREATE TABLE example_table (
id SERIAL PRIMARY KEY,
data VARCHAR(100)
);

Next, insert some random values into example_table :

INSERT INTO example_table (data) VALUES ('RandomValue1');


INSERT INTO example_table (data) VALUES ('RandomValue2');
INSERT INTO example_table (data) VALUES ('RandomValue3');

3. Check Replication on the Slave:

Finally, on the slave server, check if the data has been replicated. You can do this by querying
the same table on the slave server:

SELECT * FROM example_table;


If the slave is properly replicating data from the master, you should see the same rows that you inserted
on the master.
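
as an extra check, any write attempted on the slave should be rejected, since a streaming standby is read-only; a sketch of what you would see:

INSERT INTO example_table (data) VALUES ('ShouldFail');
-- ERROR:  cannot execute INSERT in a read-only transaction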

monitoring replication

We can verify the replication status by running the following command on the master. If the state column
displays 'streaming', it indicates that everything is functioning correctly

SELECT * FROM pg_stat_replication;
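
if you want only the key fields, a narrower query over the same view is easier to read (these column names are part of the pg_stat_replication view in PostgreSQL 12):

SELECT client_addr, state, sent_lsn, write_lsn, flush_lsn, replay_lsn
FROM pg_stat_replication;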


11.1 failover stream replication

verify the replication on slave server

stop the primary server

promote the standby to read-write

edit the pg_hba.conf

Create standby.signal on Old Primary

check if old master is read only

failback

master failback config part

promote the slave (old-master)


create standby.signal file on the master server (old slave)

start the master (old slave ) services

verify the replication on slave server

before we start the failover, we need to verify the sync between the master and slave

on the slave, run the below command; if it returns 0, it means there is no delay

SELECT CASE WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() THEN 0
ELSE EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

stop the primary server

once you have confirmed everything is fine, you need to stop the services, either by shutting down the server or
by stopping the PostgreSQL services.
for this purpose i will stop the PostgreSQL services.
systemctl stop postgresql
promote the standby to read-write

the next step is to promote the standby server so it can both read and write,
using the following command

psql -c "SELECT pg_promote();"


#This command promotes the standby to a read-write primary.

now if you try to create a database it will be allowed, since the server is now read-write
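
a quick illustrative check (the database name here is arbitrary):

CREATE DATABASE promote_test;  -- succeeds only on a read-write primary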

edit the pg_hba.conf

edit pg_hba.conf on the slave server and add the IP of the master server, which will now act as the slave
cd /etc/postgresql/12/main
echo "host replication replication 10.10.10.80/24 md5" >> pg_hba.conf
Create standby.signal on Old Primary

Create the standby.signal file on the old primary to ensure it starts as a standby when brought back
online:

touch /var/lib/postgresql/12/main/standby.signal

also edit postgresql.auto.conf on the master server with the below parameter

# Update the postgresql.auto.conf file with the new primary's details

vi /var/lib/postgresql/12/main/postgresql.auto.conf

# Modify the primary_conninfo parameter so it points at the new primary (the promoted slave):

primary_conninfo = 'user=replication password=123456789 host=10.10.10.81 port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'

after that, start the PostgreSQL services on the master server

systemctl start postgresql


check if old master is read only

run the below command: if it returns t, the database is read-only; if f, the database is read-write

SELECT pg_is_in_recovery();
This query will return a single boolean value:
true if the server is a standby and false if it's the primary

failback

first, verify that the database is syncing by running the below command on the slave server; the result should be 0

SELECT CASE WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() THEN 0
ELSE EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp()) END AS log_delay;

master failback config part

start by stopping the services on the master server

systemctl stop postgresql

make sure the below line is added to the master's pg_hba.conf (note it uses the replication user we created earlier)

host replication replication 10.10.10.80/24 md5

cd /etc/postgresql/12/main/
nano pg_hba.conf
promote the slave (old-master)

go to the slave server (old master) and make it read-write,


using the below command

psql -c "SELECT pg_promote();"

create standby.signal file on the master server (old slave)

create the standby.signal file on the server currently acting as master (the old slave), which is being demoted back to standby

touch /var/lib/postgresql/12/main/standby.signal

Edit postgresql.auto.conf on the old slave (the server being demoted back to standby)

cd /var/lib/postgresql/12/main
nano postgresql.auto.conf

add the below line, making sure host points at the old master's IP (the server being promoted back to primary)

primary_conninfo = 'user=replication password=123456789 host=10.10.10.80 port=5432 sslmode=prefer sslcompression=0 gssencmode=prefer krbsrvname=postgres target_session_attrs=any'

start the master (old slave) services

start the services using the below command

systemctl start postgresql


check the log and observe whether the server started as standby

cd /var/log/postgresql/
nano postgresql-12-main.log
12. logical replication

different from streaming replication: instead of the master sending the WAL, logical replication sends the actual
commands, such as INSERT INTO t1 VALUES (1, 'value');

[diagram] Logical Replication: Source Database → logical changes (data transformation) → Replica Database

[diagram] Physical Replication: Source Database → physical changes (data copy) → Replica Database

physical replication

in physical replication, the replica is forced to copy the whole database (schema, tables, and so on) from the
master server.
physical replication therefore cannot replicate just a single table

logical replication

as mentioned, it replicates only the changes, which gives it the advantage of being able to replicate a single table.
here the primary database sends the DML commands to be replayed on the standby server.
here's the scoop:

1. Decoding WAL Records: In PostgreSQL, Write-Ahead Logging (WAL) records all changes made to
the database. When using logical replication, the first step is to decode these WAL records. Think of
it like decrypting a secret code; this process extracts the actual changes that were made to the data.
2. Streaming to Replica Server: Once those changes are decoded, they're streamed over to the
replica server. This is like sending a live feed of the changes happening in the source database to
the replica, ensuring it stays up to date with the latest data modifications.

3. Applying Statements on Replica: On the replica server, these decoded changes are then applied
as SQL statements. It's like having a copycat follow along with the source database's actions,
executing the same SQL commands to mimic the changes.

So, in a nutshell, PostgreSQL goes through this process of decoding, streaming, and applying changes
to keep the replica database in sync with the source. It's like a well-choreographed dance of data
replication!
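
you can peek at this WAL decoding from psql using the built-in test_decoding plugin; this is a sketch that assumes wal_level = logical and an existing table t1 (the slot name is illustrative):

-- create a logical replication slot that decodes WAL with test_decoding
SELECT pg_create_logical_replication_slot('demo_slot', 'test_decoding');

-- make a change so there is something to decode
INSERT INTO t1 VALUES (1, 'value');

-- peek at the decoded changes without consuming them
SELECT * FROM pg_logical_slot_peek_changes('demo_slot', NULL, NULL);

-- drop the slot when done, so it does not hold back WAL removal
SELECT pg_drop_replication_slot('demo_slot');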

[diagram] Source Database: writes data → Decoding WAL Records → decoded changes → Streaming Changes (live stream) → Replica Database: applies changes (applying SQL statements)

in this whole setup, the primary server is called the publisher server, and the replica is called the subscriber server,
similar to MS SQL replication

physical replication

1. both servers must have an identical configuration

2. the data has to be laid out on the file system identically on both master and slave
3. this type of replication won't work for migrating from an older version to a newer version

to make the PostgreSQL 12 initdb binary the system default initdb (handy when instantiating clusters manually):
sudo update-alternatives --install /usr/bin/initdb initdb /usr/lib/postgresql/12/bin/initdb 1

logical replication limitations

1. it doesn't replicate DDL commands, such as those for creating indexes

logical replication setup

sequence of steps:

1. instantiate 2 PostgreSQL database clusters
2. configure the publisher with wal_level = logical
3. start the instances
4. create a database and the tables (a minimal sketch of the publication/subscription commands follows below)
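
to round out the sequence, a hedged sketch of the publication/subscription commands (publication, subscription, table, and connection values here are illustrative assumptions, not values taken from this setup):

-- on the publisher (requires wal_level = logical)
CREATE PUBLICATION my_pub FOR TABLE t1;

-- on the subscriber; the table must already exist there with the same schema
CREATE SUBSCRIPTION my_sub
  CONNECTION 'host=10.10.10.77 port=5432 dbname=testdb user=replication password=123456789'
  PUBLICATION my_pub;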
