How-to: HA with Zimbra 8 OSE – 12/11/2012

This how-to explains the HA installation of Zimbra 8 on an Ubuntu Server 12.04 LTS x64 machine. It was
made for the Zimbra community, and particularly to add elements to this thread:
http://www.zimbra.com/forums/administrators/58113-zimbra-pacemaker-drbd-howto.html.

This document is a compilation of information found on fora, blogs, official docs, and the fruits of my
experiments, so I don’t claim any rights: you’re free to reproduce / modify / use it!

In this documentation, the HA setup is built between two VMs named fangorn (preferred master) and
sylvebarbe (slave).
Two different storage techniques were tried: DRBD, and virtual bus sharing by VMware on a SAN.

Prior to the operations this document describes, the following packages were installed on the VMs:
- Pacemaker (http://packages.ubuntu.com/precise/pacemaker), our cluster manager.
- Corosync (http://packages.ubuntu.com/precise/corosync), our cluster engine.
- Openssh-server (http://packages.ubuntu.com/precise/openssh-server).
- Likewise-open (http://packages.ubuntu.com/precise/likewise-open).
- Sqlite3 (http://packages.ubuntu.com/precise/sqlite3).
- Sysstat (http://packages.ubuntu.com/precise/sysstat).
- Libgmp3c2 (http://packages.ubuntu.com/precise/libgmp3c2).
- Drbd8-utils (http://packages.ubuntu.com/precise/drbd8-utils).
- Zimbra Collaboration Suite 8 Open Source Edition (http://www.zimbra.com/downloads/os-downloads.html).

Finally, I tried to keep it clear and simple, but my English may not be very good, so I apologize for any
misunderstanding you might run into.

Taer.

Table of contents

A. Configure DRBD
B. Install and basic configuration of Zimbra
C. Configure the cluster
   1. Corosync
   2. Pacemaker
   3. Virtual IP
   4. DRBD
   5. Clustered Virtual Drive
   6. Automatic launch of Zimbra on the nodes
D. Tests
E. Help
   1. Split Brain recovery
   2. Forcing reconnection to ms-zimbra resource



A. Configure DRBD

Edit /etc/drbd.conf on the two machines:

# You can find an example in /usr/share/doc/drbd.../drbd.conf.example

include "drbd.d/global_common.conf";
include "drbd.d/*.res";

resource zimbra {
    syncer {
        rate 30M;    # Increases the sync speed (should be about 30% of the max speed
                     # of the "I/O + network" chain). /!\ This is the DRBD 8.3 syntax,
                     # please check your version.
    }
    meta-disk internal;
    on sylvebarbe {
        device /dev/drbd0;             # Your DRBD device
        disk /dev/sda2;                # Your "real" partition
        address 192.168.6.11:7788;     # Your VM IP and port for DRBD communication
                                       # for this resource
    }
    on fangorn {
        device /dev/drbd0;
        disk /dev/sda2;
        address 192.168.6.10:7788;
    }
}

Remove any reference to those partitions in /etc/fstab.
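
For example, if the partition used to be mounted directly, the corresponding line has to go (the mount
point and options below are only an illustration, not taken from the original setup):

# /etc/fstab – remove or comment out the entry for the DRBD-backed partition
# /dev/sda2   /opt/zimbra   ext4   defaults   0   2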

Create the resources:

root@fangorn ~# drbdadm create-md zimbra

root@sylvebarbe ~# drbdadm create-md zimbra

If an error appears, the simplest fix is to recreate the partition (/!\ Beware: it should not be formatted):

fdisk /dev/sda2

Start DRBD on the two nodes:

service drbd start

Define a primary node (with synchronization):

root@fangorn ~# drbdsetup /dev/drbd0 primary -o

With the command "/etc/init.d/drbd status" you can watch the progress of the synchronization.
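
An equivalent way to follow the synchronization (just a convenience, not part of the original how-to)
is to watch the kernel's DRBD status file directly:

watch -n1 cat /proc/drbd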

One can now create a file system:

mkfs.ext4 /dev/drbd0



Then, mount it (with “mount /dev/drbd0 /opt/zimbra”).

If one of the disks becomes unavailable, DRBD keeps a local record of the modifications. When the
slave reconnects, the master sends it all the missing updates, so writing can continue as if nothing
had happened.

B. Install and basic configuration of Zimbra

Install Zimbra on every VM of the cluster, with "/dev/drbd0" mounted on "/opt/zimbra".


It might be possible to avoid a full install on the second VM by just copying the right files, but with
only 2 machines I found it easier to simply install twice.

If Zimbra is already installed, errors could happen (port conflicts, LDAP errors, …); in this case,
uninstall Zimbra first by following the official doc
(http://wiki.zimbra.com/wiki/UnInstalling_Zimbra_on_Linux).

To install Zimbra from an archive (available here: http://www.zimbra.com/downloads/os-downloads.html):

# tar -zxf zcs-X.X.X_GA….tgz
# cd zcs-X.X.X_GA…
# ./install.sh

Perform the install that corresponds to your needs, but be sure to change the following parameter in
the configuration menu:

1 > 1 > localhost (or whatever name you want, as long as it is the same on the two machines and appears
in each /etc/hosts; see the example below)
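
As an illustration, the /etc/hosts of each node could look like this (the service name "zimbra-ha" is a
hypothetical choice; the addresses are the ones used throughout this how-to):

127.0.0.1        localhost
192.168.100.40   zimbra-ha          # virtual IP of the cluster (service address)
192.168.6.10     fangorn
192.168.6.11     sylvebarbe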

C. Configure the cluster

We will configure Corosync so the VMs watch each other, and Pacemaker so the active VM takes the
virtual IP 192.168.100.40 and mounts all the partitions needed by Zimbra.

1. Corosync

Edit the Corosync config file: /etc/corosync/corosync.conf. This file should be the same on all the
nodes (all the VMs of the cluster). The idea is to increase some timers to reduce the risk of a node
being falsely declared out of order (an example totem block follows the list):

• token: 5000
• token_retransmits_before_loss_const: 20
• join: 1000
• consensus: 7500
• max_messages: 20
• secauth: on #on = using an auth key to allow a node to connect to the cluster
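
For reference, here is what the corresponding part of the totem block could look like with these values
(a sketch only; merge it into the totem block your package shipped rather than replacing it):

totem {
        version: 2
        secauth: on           # on = use an auth key to allow a node to join the cluster
        token: 5000
        token_retransmits_before_loss_const: 20
        join: 1000
        consensus: 7500
        max_messages: 20
        ...
}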



Now, we should declare a network interface to allow communication between the nodes.

You will need to replace/add:

• Your network address in bindnetaddr. Example: if your eth0 has the address 192.168.6.11 with
a netmask of 24, put 192.168.6.0.
• Your multicast address and port (or you can keep the default ones).
• To add a second interface, add another "interface" block and increase the "ringnumber", put
the address of your second network and a new multicast address. You will then have to change
"rrp_mode" to active: your two interfaces will work simultaneously. If you change
"rrp_mode" to passive, the second interface will be activated only if the first one is broken. With
"rrp_mode" set to none, only the first interface will ever be used.

Example:

rrp_mode: none

interface {
ringnumber: 0
bindnetaddr: 192.168.6.0
mcastaddr: 226.94.1.1
mcastport: 5405
}

Authentication

When the package was installed, a key was generated to authenticate your node to the cluster. You
can recreate it:

# corosync-keygen

Then, you have to send the key on the different nodes of the cluster:

scp /etc/corosync/authkey root@my_other_node:/etc/corosync
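
The key must be readable only by root; if the copy changed its ownership or permissions, restore them
(a standard precaution, not an explicit step of the original how-to):

chown root:root /etc/corosync/authkey
chmod 400 /etc/corosync/authkey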

The configuration of the cluster is done. Enable the start of Corosync at boot by modifying the
file /etc/default/corosync. You can start it manually with: /etc/init.d/corosync start
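
On Ubuntu 12.04 that file normally contains a single START flag; a minimal example (check the file your
package installed):

# /etc/default/corosync
START=yes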

2. Pacemaker

You now have to add the Pacemaker service to your cluster. Add the following block to your Corosync
config file if it isn't already there:

service {
# Load the Pacemaker Cluster Resource Manager
ver: 0
name: pacemaker
}



If you type the command “crm_mon -1”, you should see something like this:

============
Last updated: Fri Oct 4 11:29:26 2012
Stack: openais
Current DC: sylvebarbe - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
0 Resources configured.
============

Online: [ sylvebarbe fangorn ]

With that command you will be able to see the state of your cluster (nodes state, resources state…).

Now, we will connect to the Pacemaker CLI (on only one of the nodes) with the command crm:

root@fangorn:~# crm
crm(live)# help

This is the CRM command line interface program.

Available commands:

cib manage shadow CIBs


resource resources management
node nodes management
options user preferences
configure CRM cluster configuration
ra resource agents information center
status show cluster status
quit,bye,exit exit the program
help show help
end,cd,up go back one level

crm(live)# configure
crm(live)configure# show
node fangorn
node sylvebarbe

Then, to deactivate stonith and ignore the loss of the quorum:

crm(live)configure# property stonith-enabled="false" no-quorum-policy="ignore"

• Check the config with show.


• Push the modifications on all the nodes with the command commit.

To sum up:

crm(live)configure# show
node fangorn
node sylvebarbe
property $id="cib-bootstrap-options" \
stonith-enabled="false" \
no-quorum-policy="ignore"



crm(live)configure# commit
WARNING: CIB changed in the meantime: won't touch it!
Do you still want to commit? y

You can ignore this warning.


Your cluster is now ready to manage resources (virtual IP, file systems…).

3. Virtual IP

The VIP is the service address of your cluster; this means it is the address you will use to access
your service in HA. This IP will be added as a secondary address of your network interface.

Still in crm configure, inject the following configuration and validate it with commit:

primitive VIP1 ocf:heartbeat:IPaddr2 \
    params ip="192.168.100.40" broadcast="192.168.100.255" nic="eth0" cidr_netmask="24" iflabel="VIP1" \
    op monitor interval="30s" timeout="30s"

A “crm_mon -1” now gives:

============
Last updated: Mon Oct 4 13:34:42 2012
Stack: openais
Current DC: sylvebarbe - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
1 Resources configured.
============

Online: [ sylvebarbe fangorn ]

VIP1 (ocf::heartbeat:IPaddr2): Started sylvebarbe

Translation: I have a resource “VIP1” on my cluster, started on the node sylvebarbe.

If you look at ifconfig:

eth0:VIP1 Link encap:Ethernet  HWaddr 00:01:02:03:04:05
          inet addr:192.168.100.40  Bcast:192.168.100.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1



Ok, now a little diagram of what our cluster looks like:

4. DRBD

To configure this service, still in crm configure, inject the following config, and then validate
with commit:

Definition of the DRBD resource:

primitive zimbra ocf:heartbeat:drbd \
    params drbd_resource="zimbra" \
    op monitor role="Master" interval="59s" timeout="30s" \
    op monitor role="Slave" interval="60s" timeout="30s"

Definition of the master/slave (multi-state) resource:

ms ms-zimbra zimbra \
    meta clone-max="2" notify="true" globally-unique="false" target-role="Started"

Definition of the mounted file system resource:

primitive fs-zimbra ocf:heartbeat:Filesystem \
    params device="/dev/drbd0" directory="/opt/zimbra" fstype="ext4"

Creation of a group containing the VIP resource and the FS resource; this group is defined as colocated
with ms-zimbra (our main resource):

group zimbra-group VIP1 fs-zimbra

colocation zimbra-group-on-ms-zimbra inf: zimbra-group ms-zimbra:Master

Definition of the starting order of resources, on a unique Master (Fangorn if possible):

order ms-zimbra-before-zimbra-group inf: ms-zimbra:promote zimbra-group:start

location ms-zimbra-master-on-fangorn ms-zimbra rule role=master 100: #uname eq fangorn

“crm_mon -1” output:

============



Last updated: Tue Oct 23 11:56:57 2012
Last change: Tue Oct 23 11:48:43 2012 via cibadmin on fangorn
Stack: openais
Current DC: fangorn - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
4 Resources configured.
============

Online: [ sylvebarbe fangorn ]

Master/Slave Set: ms-zimbra [zimbra]


Masters: [ fangorn ]
Slaves: [ sylvebarbe ]
Resource Group: zimbra-group
VIP1 (ocf::heartbeat:IPaddr2): Started fangorn
fs-zimbra (ocf::heartbeat:Filesystem): Started fangorn

To visualize DRBD output:

# /etc/init.d/drbd status

drbd driver loaded OK; device status:
version: 8.3.11 (api:88/proto:86-96)
srcversion: 71955441799F513ACA6DA60
m:res      cs         ro                 ds                 p  mounted      fstype
0:zimbra   Connected  Primary/Secondary  UpToDate/UpToDate  C  /opt/zimbra  ext4

When I first applied this configuration, I had some resource errors; in that case, make sure you forgot
nothing (such as the mount directory) and reboot your machine.
The policy engine ("pengine") will prevent a resource known to produce errors from starting again,
which is why rebooting is important to reset this behavior.
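
Instead of a full reboot, clearing the failure history of the resource is usually enough (a lighter
alternative, not the method used in the original how-to):

crm resource cleanup ms-zimbra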

Diagram:



5. Clustered Virtual Drive

Since we will use a SAN, this time we will mount a file system that is not managed by DRBD.
It will be used to store the emails and the indexes of the users' mailboxes, and will be mounted on
"/mail/data" on our servers.

To do so:

• Connect to VMCENTER.
• Open the properties of a VM.
• Add a virtual disk that supports clustering features.
• Put this disk on a new SCSI controller with the bus sharing mode set to "Virtual".

Now, on the other VM(s):

• Connect to VMCENTER.
• Open the properties of the VM.
• Add our previously created virtual disk.
• Put this disk on a new SCSI controller with the bus sharing mode set to "Virtual".

Note: virtual bus sharing makes it possible to attach several VMs to the same disk, but forbids the use
of snapshots.

After a restart of our VMs, this new disk is ready; let's format it!

To know its logical name:

lshw -C disk

Creation of a partition:

fdisk /dev/sdb

Creation of the file system:

mkfs.ext4 /dev/sdb1

On each VM, create a directory to host our partition:

mkdir -p /mail/data
chown zimbra:zimbra /mail/data
chmod 775 /mail/data

As the zimbra user:

su zimbra
mkdir /mail/data/index
mkdir /mail/data/store

index will contain the index files (by default /opt/zimbra/index), and store will contain the mail data
(by default /opt/zimbra/store).
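
The original how-to does not detail how Zimbra is pointed at these new directories; one possible way
(an assumption on my part, to be adapted to your setup) is to replace the default directories with
symbolic links while the Zimbra services are stopped:

su - zimbra -c 'zmcontrol stop'
mv /opt/zimbra/index /opt/zimbra/index.orig      # hypothetical backup names
mv /opt/zimbra/store /opt/zimbra/store.orig
ln -s /mail/data/index /opt/zimbra/index
ln -s /mail/data/store /opt/zimbra/store
su - zimbra -c 'zmcontrol start'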



Then, in crm configure, inject the following configuration:

primitive fs-zmdata ocf:heartbeat:Filesystem \
    params device="/dev/sdb1" directory="/mail/data" fstype="ext4"

Then, with the "edit" command, modify the configuration to add our new resource to the group, before
validating with "commit":

group zimbra-group VIP1 fs-zimbra fs-zmdata

The file system is now automatically mounted on the master of the cluster.

6. Automatic launch of Zimbra on the nodes

Adrián Gibanel from bTactic (http://www.btactic.com) wrote an OCF script that handles the automatic
launch of Zimbra. It was designed for Ubuntu 10.04 + Zimbra 7
(http://www.btactic.org/2012/08/26/zimbra-7-ose-ha-alta-disponibilidad-pacemaker-corosync-drbd-ubuntu-10-04/),
but seems to work fine with Ubuntu 12.04 + Zimbra 8.

On each VM of the cluster, do the following:

mkdir /usr/lib/ocf/resource.d/btactic



Put the bTactic Zimbra OCF script in this folder, then:

chmod +x /usr/lib/ocf/resource.d/btactic/*

On one of the VMs enter crm configure, and inject the config:

primitive ZMServer ocf:btactic:zimbra \
    op monitor interval="120s" timeout="40s" \
    op start interval="0" timeout="360s" \
    op stop interval="0" timeout="360s"

Then, with the edit command, modify the configuration to add our new resource to the group, before
validating with commit:

group zimbra-group VIP1 fs-zimbra fs-zmdata ZMServer

Finally, this is the output of a crm status:

============
Last updated: Wed Oct 24 15:36:27 2012
Last change: Wed Oct 24 13:54:08 2012 via cibadmin on fangorn
Stack: openais
Current DC: sylvebarbe - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, 2 expected votes
6 Resources configured.
============

Online: [ sylvebarbe fangorn ]

Master/Slave Set: ms-zimbra [zimbra]


Masters: [ fangorn ]
Slaves: [ sylvebarbe ]
Resource Group: zimbra-group
VIP1 (ocf::heartbeat:IPaddr2): Started fangorn
fs-zimbra (ocf::heartbeat:Filesystem): Started fangorn
fs-zmdata (ocf::heartbeat:Filesystem): Started fangorn
ZMServer (ocf::btactic:zimbra): Started fangorn

Zimbra is now started automatically on the master of the cluster. But the Zimbra services can take
some time to start (between 1 min 30 s and 2 min), so don't panic if ZMServer says "Stopped" for a
while.

D. Tests

For my tests I used a script that adds 100 mailboxes and sends a mail to the admin once each mailbox is
created. This allows testing the two methods (DRBD with user creation, and virtual bus sharing with
mail delivery); a sketch of such a script is shown below.
All tests were made twice to confirm the results.
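
The original script is not included in this document; a minimal sketch of what it could look like (the
domain, password and admin address are hypothetical):

#!/bin/bash
# Create 100 mailboxes and notify the admin after each creation.
for i in $(seq 1 100); do
    su - zimbra -c "zmprov ca testuser${i}@example.com Password123"
    echo "testuser${i} created" | mail -s "Mailbox ${i} created" admin@example.com
done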

First test: network loss on Fangorn during script execution

Fangorn still considers itself the master and, from its point of view, Sylvebarbe is lost.

Sylvebarbe sees the loss of Fangorn and promotes itself to master.



None of the new accounts created on Fangorn appear on Sylvebarbe, but mails sent to the admin do.

When Fangorn's network is up again:

- For several seconds there are two masters, and a split brain is detected.
- Fangorn takes back control, and Sylvebarbe becomes the slave again.

Mails and indexes keep the previous state on Fangorn (all modifications made by Sylvebarbe are lost),
plus all the mails that had been queued on Fangorn. And this even if Fangorn was shut down after the
network loss.

Note: if I restart Fangorn after the network loss, the Zimbra resource owned by the bTactic script takes
the state "Stopped" and after a few seconds "Started sylvebarbe" again. This is quite strange since the
only change in crm status is the appearance of Fangorn as slave.

All the modifications to the database made by the script on Fangorn are on its side of /dev/drbd0, so
a resync of DRBD is enough to keep or discard the database modifications, depending on which node you
choose as primary and which as secondary.
Fangorn is automatically promoted master again after some time.

To sum up, in case of split brain:

- No data corruption detected.
- DRBD keeps the modifications, and you have the possibility to discard them.
- The FS mounted via virtual bus sharing loses any modifications made on the temporarily promoted master.

Second test: Fangorn power off during script execution

Sylvebarbe sees that Fangorn is missing, and is promoted master.

Users created on Fangorn don’t appear.


Mails are in the admin mailbox, but of course some are missing since the script was interrupted.

When Fangorn starts up again, DRBD automatically re-syncs, so Sylvebarbe's version is kept.

To sum up, in case of master loss:

- No data corruption detected.
- DRBD doesn't keep the modifications.
- The FS mounted via virtual bus sharing keeps all the modifications.



E. Help

1. Split Brain recovery

DRBD detects a split brain when the connection between the nodes comes up again and the nodes perform
their DRBD protocol handshake. If DRBD detects that the two nodes are (or were at disconnection time) in
the primary role, replication is cut. The following message appears in the system log:

Split-Brain detected, dropping connection!

After split brain detection, one node will always be in StandAlone mode. The other will either also be
in StandAlone mode (if the two nodes detected the split brain simultaneously), or in WFConnection (if
the connection went down before the other node had a chance to detect the split brain).

At this point, if DRBD is not configured to recover by itself, a manual intervention is needed on the
node whose modifications will be discarded (the split brain victim):

drbdadm secondary resource
drbdadm -- --discard-my-data connect resource

Where resource is zimbra for /dev/drbd0 in our example.

On the other node (split brain survivor), if it is in StandAlone mode:

drbdadm connect resource

If the node is in WFConnection mode, it will reconnect automatically.

When reconnected, the split brain victim changes to SyncTarget mode, and its data are replaced by those
of the split brain survivor.

After re-synching, split brain is resolved and the two nodes are UpToDate.
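
As an aside, DRBD can also be told to resolve some split brain situations by itself. A sketch of the
relevant options (these policies are an example, not part of the setup described above; add them to the
net section of the resource and choose values suited to your data):

resource zimbra {
    net {
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    ...
}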

2. Forcing reconnection to ms-zimbra resource

If the master node is down, the slave node is promoted master.


But in some cases, once the old master is up again, getting back to the initial state can be slow or
not automatic, so there is a command to force the reconnection of the old master to the ms-zimbra
resource:

crm resource start ms-zimbra
