Deploying PostgreSQL Clusters Using Kubernetes StatefulSets
Jeff McCormick
Feb 26, 2017·6 min read
This blog explains how to build a PostgreSQL cluster using a new Kubernetes feature,
StatefulSets. StatefulSets provide a very simple, Kubernetes-native mechanism for making
clustering decisions when deploying a PostgreSQL cluster.
The Crunchy PostgreSQL Container Suite is a set of containers that can be used to deploy,
monitor, and administer the open source PostgreSQL database. More details can be found in the
crunchy-containers GitHub repository. In a prior blog, Crunchy Data described how to
deploy a similar cluster using Helm.
StatefulSets Example
Step 1 - Create Kube Environment
StatefulSets is a new feature and, as a result, running this example will require an environment
based on Kubernetes 1.5.
The example in this blog deploys on CentOS 7 using kubeadm. Instructions on
what kubeadm provides and how to deploy a Kubernetes cluster are available in the kubeadm
documentation.
The example script assumes the NFS server is running locally and the hostname resolves to a
known IP address.
In summary, the steps used to get NFS working on a CentOS 7 host are as follows:
sudo vi /etc/exports
sudo exportfs -r
The /etc/exports file should contain a line similar to this one except with the applicable IP
address specified:
cd $HOME
git clone https://github.com/CrunchyData/crunchy-containers.git
cd crunchy-containers/examples/kube/statefulset
export CCP_IMAGE_TAG=centos7-9.5-1.2.6
BUILDBASE is where you cloned the repository and CCP_IMAGE_TAG is the container image
version we want to use.
./run.sh
Immediately after the pods are created, the deployment will be as depicted below:
So, how do the containers determine which will be the master and which will be a replica?
This is where the new StatefulSet mechanics come into play. The StatefulSet mechanics assign a
unique ordinal value to each pod in the set.
The ordinal values provided by the StatefulSet always start at 0. During the initialization of
the container, each container examines its assigned ordinal value. An ordinal value of 0 causes
the container to assume the master role within the database cluster. For all other ordinal values,
the container assumes a replica role. This is a very simple form of discovery made possible by
the StatefulSet mechanics.
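The role decision described above can be sketched in a few lines of shell. This is an illustration only, not the actual crunchy-postgres startup logic: the function name is hypothetical, and the sketch relies on the fact that StatefulSet pods are named <set-name>-<ordinal>, so the ordinal can be read from the end of the pod hostname.

```shell
# Sketch: derive a cluster role from a StatefulSet pod hostname such as "pgset-0".
# role_for_host is a hypothetical helper, not part of the crunchy-postgres container.
role_for_host() {
  # Strip everything up to and including the last hyphen, leaving the ordinal.
  ordinal="${1##*-}"
  if [ "$ordinal" = "0" ]; then
    echo "master"
  else
    echo "replica"
  fi
}

role_for_host "pgset-0"   # prints: master
role_for_host "pgset-1"   # prints: replica
```

Inside a pod, the same helper could be called as role_for_host "$(hostname)".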
Replicas are configured to connect to the master database via a Service dedicated to the master
database. In order to support this replication, the example creates a separate Service for each of
the master role and the replica role. Once the replica has connected, the replica will begin
replicating state from the master.
During the container initialization, a master container will use a Service Account (pgset-sa) to
change its container label value to match the master Service selector. Changing the label is
important to enable traffic destined to the master database to reach the correct container within
the StatefulSet. All other pods in the set assume the replica Service label by default.
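The relabeling step could look roughly like the sketch below. The label key "name", the label value, and the function name are assumptions made for illustration; the real key and value are whatever the master Service selector in the example uses. The call only succeeds if the pod's Service Account is permitted to update pods.

```shell
# Sketch: relabel a pod so that the master Service selector matches it.
# Assumptions: the Service selects on a label key called "name" with the
# value "pgset-master"; label_as_master is a hypothetical helper.
label_as_master() {
  # Default to the current hostname, which equals the pod name inside a pod.
  pod_name="${1:-$(hostname)}"
  kubectl label pod "$pod_name" name=pgset-master --overwrite
}
```

Calling label_as_master with no argument from inside the ordinal-0 pod would relabel that pod.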
The crunchy-postgres container supports other forms of cluster deployment; the style of
deployment is dictated by the PG_MODE environment variable set for the container. In the
case of a StatefulSet deployment, that value is set to:
PG_MODE=set
This environment variable is a hint to the container initialization logic as to the style of
deployment intended.
In addition, the tests below assume that the test environment resolves names through the Kube
DNS and that its DNS search path matches the applicable Kube namespace and domain. The master
service is named pgset-master and the replica service is named pgset-replica.
Test the master as follows (the password is password):
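One way to run this check is sketched below. It assumes the psql client is installed on the test host and that the superuser and the database are both named postgres; neither detail is stated in this post, and the function name is purely illustrative. The query reads the standard pg_stat_replication view through the pgset-master Service.

```shell
# Sketch: list the replicas currently streaming from the master.
# Assumptions: psql is installed; user and database are both "postgres".
check_master_replication() {
  PGPASSWORD=password psql -h pgset-master -U postgres postgres \
    -c 'SELECT client_addr, state FROM pg_stat_replication;'
}
```

Run check_master_replication from a host that can resolve the Service name.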
If things are working, the command above will return output indicating that a single replica is
connecting to the master.
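The replica can be tested in a similar way by attempting a write through the pgset-replica Service. As before, this is a sketch under assumptions: psql is installed, the user and database are both named postgres, and the table name foo is a throwaway example.

```shell
# Sketch: attempt a write through the replica Service.
# Assumptions: psql is installed; user and database are both "postgres";
# "foo" is an arbitrary table name used only for this test.
check_replica_read_only() {
  PGPASSWORD=password psql -h pgset-replica -U postgres postgres \
    -c 'CREATE TABLE foo (id int);'
}
```

Run check_replica_read_only from the same host used for the master test.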
The command above should fail as the replica is read-only within the cluster.
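Scaling the set adds another replica. The sketch below assumes the StatefulSet itself is named pgset, which is inferred from the pod names (StatefulSet pods are named <set-name>-<ordinal>), and that the set starts with two pods, so a replica count of 3 adds one more. The function name is hypothetical.

```shell
# Sketch: scale the StatefulSet to a given number of pods (default 3).
# Assumption: the StatefulSet is named "pgset", inferred from the pod names.
scale_pgset() {
  kubectl scale statefulset pgset --replicas="${1:-3}"
}
```

Run scale_pgset with no argument to request three pods in total.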
The command above should successfully create a new replica pod called pgset-2 as depicted
below:
Step 8 - Persistence Explained
Take a look at the persisted data files on the resulting NFS mount path:
ls -l /nfsfileshare/
total 12
Each container in the StatefulSet binds to the single NFS Persistent Volume Claim (pgset-pvc)
created in the example script.
Since NFS and the PVC can be shared, each pod can write to this NFS path.
The container is designed to create a subdirectory on that path using the pod host name for
uniqueness.
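The per-pod subdirectory scheme can be illustrated with a short shell sketch. This is not the container's actual code: the function name and its arguments are hypothetical, and the only idea carried over from the text is that the pod host name keeps each pod's directory unique on the shared path.

```shell
# Sketch: create a per-pod data directory under a shared mount path.
# make_pod_dir and its arguments are hypothetical, for illustration only.
make_pod_dir() {
  data_root="$1"
  # Default to the current hostname, which is unique per pod in a StatefulSet.
  pod_name="${2:-$(hostname)}"
  mkdir -p "$data_root/$pod_name"
  echo "$data_root/$pod_name"
}
```

For example, make_pod_dir /nfsfileshare pgset-1 would create and print /nfsfileshare/pgset-1, so pods sharing the same claim never write into each other's directories.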
Conclusion
StatefulSets is an exciting feature added to Kubernetes for container builders that are
implementing clustering. The ordinal values assigned to the set provide a very simple mechanism
to make clustering decisions when deploying a PostgreSQL cluster.