Mastering OpenStack - Sample Chapter
Omar Khedher
Preface
Since its first official release in 2010, OpenStack has distinguished itself as the
ultimate open source cloud operating system. Today, more than 200 companies
worldwide have joined the development of the OpenStack project, which makes
it an attractive cloud computing solution for thousands of organizations. The
main reason behind the success of OpenStack is not the overwhelming number
of features that it has implemented, but rather its good modularity. Thanks to its
vast community around the world, OpenStack is growing very fast. Each release
exposes new modules and administrative facilities that offer on-demand computing
resources by provisioning a large set of networks of virtual machines. If you are
looking for a cloud computing solution that scales out well, OpenStack is an ideal
option. Nowadays, it is considered to be a mature cloud computing operating
system. Several big, medium, and small enterprises have adopted it as a solution in
their infrastructure. The nirvana of OpenStack comes from its architecture: designing
your cloud becomes much easier, with more flexibility. It is an ideal solution whether
you intend to design a start-up cloud environment or to integrate OpenStack into your
existing infrastructure. As you build your cloud using OpenStack, you will be able
to integrate with legacy systems and third-party technologies while eliminating vendor
lock-in as much as possible.
This book is designed to discuss what is new in OpenStack with regards to the new
features and incubated projects. You will be guided through this book from design
to deployment and implementation with the help of a set of best practices in every
phase. Each topic is elaborated so that you can see the big and complete picture of
a true production environment that runs OpenStack at scale. It will help you decide
how to deploy OpenStack by determining the best fit for your private cloud, such as
the compute, storage, and network components.
If you are ready to start a real private cloud running OpenStack, master the
OpenStack design, and deploy and manage a scalable OpenStack infrastructure, this
book will prove to be a clear guide that exposes the latest features of the OpenStack
technology and helps you leverage its power to design and manage any medium or
large OpenStack infrastructure.
Chapter 4, Learning OpenStack Storage – Deploying the Hybrid Storage Model, will
cover the subject of storage in OpenStack. The chapter will start by focusing on the
storage types and their use cases. You will learn about the object storage service,
code-named Swift, and how it works in OpenStack. A real Swift deployment will be
shown to help you calculate the hardware requirements. The chapter will also talk
about the block storage service, code-named Cinder. You will learn how to decide
which storage type will fulfill your needs. It will also explore Ceph and its main
architectural design, and help you integrate and install it in your test OpenStack
environment using Vagrant and Chef.
Chapter 5, Implementing OpenStack Networking and Security, will focus mainly on the
networking security features in OpenStack. It will cover the concept of namespaces
and security groups in OpenStack and how you can manage them using the Neutron
and Nova APIs. In addition, it will explore the new networking security feature,
Firewall as a Service. A case study will help you understand another networking
feature in Neutron called VPN as a Service.
Chapter 6, OpenStack HA and Failover, will cover the topics of high availability and
failover. For each component of the OpenStack infrastructure, this chapter will
expose several HA options. The chapter is replete with HA concepts and best
practices that will help you define the best HA OpenStack environment. It
complements the previous chapters by presenting a distributed, fault-tolerant
OpenStack architecture design. Numerous open source solutions, such as HAProxy,
Keepalived, Pacemaker, and Corosync, will be discussed through step-by-step
instructions.
Chapter 7, OpenStack Multinode Deployment – Bringing in Production, will be your
"first production day" guide. It will focus on how you can deploy a complete
multinode OpenStack setup. A sample setup will be explained and described in
detail by exposing the different nodes and their roles, the network topology, and
the deployment approach. The chapter will contain a practical guide to OpenStack
deployment using the bare-metal provisioning tool xCAT together with the Chef
server. It will demonstrate the first run of a new OpenStack tenant.
Chapter 8, Extending OpenStack – Advanced Networking Features and Deploying Multitier Applications, will delve into the advanced OpenStack networking features. It will
explain in depth the Neutron plugins such as Linux Bridge and Open vSwitch, how
they differ from the architectural perspective, and how instances can be connected to
networks with the Neutron plugins. The chapter will also cover Load Balancing as a
Service, which is used to load balance the traffic between instances by exploring their
fundamental components. In addition, an orchestration module named Heat will be
introduced in this chapter and will be used to build a complete stack to show how a
real load balancer is deployed in OpenStack.
Chapter 9, Monitoring OpenStack – Ceilometer and Zabbix, will explore another newly
incubated project called Ceilometer, the telemetry module for OpenStack.
The chapter will briefly discuss the architecture of Ceilometer and how you can
install and integrate it into an existing OpenStack environment. The discussion on
Heat will be resumed, and it will be used to expand a stack installation to include
Ceilometer. The purpose of this is to discover the capabilities of Heat with regard to
supporting the Ceilometer functions, such as alarms and notifications. This section
will also make sure that the OpenStack environment is well monitored using
external monitoring tools such as Zabbix for advanced triggering capabilities.
Chapter 10, Keeping Track of Logs – Centralizing Logs with Logstash, will talk about
the problem of logging in OpenStack. The chapter will present a sophisticated
logging solution called Logstash. It will go beyond the tailing and grepping of
single log lines to tackle complex log filtering. The chapter will provide instructions
on how to install Logstash and forward the OpenStack log files to a central
logging server. Furthermore, a few snippets will be provided to demonstrate
the transformation of the OpenStack data logs and events into elegant graphs
that are easy to understand.
Chapter 11, Tuning OpenStack Performance – Advanced Configuration, will wrap things
up by talking about how you can make the OpenStack infrastructure perform better.
Different topics, such as advanced configuration of the existing OpenStack
environment, will be discussed. The chapter will put under the microscope the
performance enhancement of MySQL by means of hardware upgrades and software
layering such as memcached. You will learn how to tune the OpenStack
infrastructure component by component using a newly incubated OpenStack project
called Rally.
Chapter 6
Fallback: Once the primary is back after a failure event, the service can be
migrated back from the secondary.
On the other hand, we may come across a different but related term, which you
have most likely already encountered: load balancing. In a heavily loaded
environment, load balancers are introduced to redistribute requests to less loaded
servers. This can look similar to the high-performance clustering concept, but you
should note that a high-performance cluster cooperates on the same request, whereas
a load balancer distributes independent requests across servers as evenly as its
scheduling algorithm allows.
HA levels in OpenStack
It is important to understand the context of HA deployments in OpenStack.
This makes it imperative to distinguish between the different levels of HA to be
considered in the cloud environment:
L1: This includes physical hosts, network and storage devices, and
hypervisors
L2: This includes the OpenStack services themselves, together with the
databases and message queues they rely on
L3: This includes the virtual machines running on hosts that are managed by
OpenStack services
L4: This includes the applications running inside the virtual machines
The main focus of supporting HA in OpenStack has been on L1 and L2, which
are covered in this chapter. On the other hand, L3 HA has limited support in the
OpenStack community. By virtue of its multistorage backend support, OpenStack is
able to bring instances back online in the case of a host failure by means of live
migration. Nova also supports the Nova evacuate implementation, which fires up API
calls to evacuate VMs to a different host on a compute node failure. The Nova evacuate
command is still limited, as it does not provide an automatic way of instance failover.
L3 and L4 HA are otherwise considered beyond the scope of this book. L4 HA has been
touched on, and enhanced by, the community since the Havana release. Basically, a few
incubated projects in OpenStack, such as Heat, Sahara, and Trove, have begun to cover
HA and monitoring gaps at the application level. Heat will be introduced in
Chapter 8, Extending OpenStack – Advanced Networking Features and Deploying
Multitier Applications, while Sahara and Trove are beyond the scope of this book.
Live migration is the ability to move running instances from one
host to another with, ideally, no service downtime. By default, live
migration in OpenStack requires a shared filesystem, such as a
Network File System (NFS). It also supports block live migration,
where virtual disks are copied over TCP without the need for a
shared filesystem. Read more on VM migration support in the
latest OpenStack release at http://docs.openstack.org/admin-guide-cloud/content/section_configuring-compute-migrations.html.
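As a quick illustration of block live migration from the Nova CLI (the instance and host names here are hypothetical, not from the book):

```
# Move a running instance without shared storage; --block-migrate
# copies the virtual disks over TCP to the target compute node
nova live-migration --block-migrate vm01 compute02
```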
Availability means that a service is not only running, but is also exposed and able to
be consumed. Let's take a quick look at the maximum downtime implied by each
availability percentage, or HA expressed as X nines:

Availability level   Availability percentage   Downtime/year    Downtime/day
1 Nine               90%                       ~ 36.5 days      ~ 2.4 hours
2 Nines              99%                       ~ 3.65 days      ~ 14 minutes
3 Nines              99.9%                     ~ 8.76 hours     ~ 86 seconds
4 Nines              99.99%                    ~ 52.6 minutes   ~ 8.6 seconds
5 Nines              99.999%                   ~ 5.25 minutes   ~ 0.86 seconds
6 Nines              99.9999%                  ~ 31.5 seconds   ~ 0.0086 seconds
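The downtime figures in the table follow directly from the availability percentage. A minimal sketch to reproduce them (the helper function is ours, not from the book):

```shell
# Derive yearly and daily downtime from an availability percentage,
# using a 365-day year as the table above does.
downtime() {
  awk -v a="$1" 'BEGIN {
    year = (100 - a) / 100 * 365 * 24 * 60   # minutes of downtime per year
    day  = (100 - a) / 100 * 24 * 60         # minutes of downtime per day
    printf "%.1f min/year, %.2f min/day\n", year, day
  }'
}

downtime 99.99   # four nines: roughly 52.6 minutes per year
```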
A paradox may appear between the lines when we consider that eliminating the
SPOF in a given OpenStack environment will include the addition of more hardware
to join the cluster. At this point, you might be exposed to creating more SPOF and,
even worse, complicated infrastructure where maintenance turns into a difficult task.
Measuring HA
Here is a simple tip: if you do not measure something, you cannot manage it. But
what kind of metrics can be measured in a highly available OpenStack infrastructure?
Admittedly, HA techniques aim to increase the availability of resources, but there
will always be reasons you may face an interruption at some point! You may
notice that the previous table did not mention any value equal to 100 percent uptime.
First, you may appreciate the vendor-lock-in-free hallmark that OpenStack offers on
this topic. Basically, you should mark the differences between the HA functionalities
that exist in a virtual infrastructure. Several HA solutions provide protection to virtual
machines when there is a sudden failure of the host machine by restoring the instance
on a different host. But what about the virtual machine itself? Does it hang? So far, we
have seen different levels of HA: in OpenStack, the cloud controllers run the
manageable services, the compute hosts can run any hypervisor engine, and, in third
rank, comes the instance itself.
This last level might not be a cloud administrator's task, since maximizing the
availability of an instance's internal services belongs to the end user. However, what
should be taken into consideration is what affects the instance externally, such as:
Storage attachment
The HA dictionary
To ease the following sections of this chapter, it might be necessary to remember a
few terms that will justify high availability and failover decisions later:
Stateless service: This is a service that does not require any record of
previous requests. Basically, each request is handled based only on
the information that comes with it. In other words, there is no dependency
between requests, so the data, for example, does not need any replication.
If a request fails, it can be performed on a different server.
Stateful service: This, on the other hand, depends on the results of previous
requests and therefore needs its data replicated across nodes; MySQL and
RabbitMQ are typical examples.
Hands on HA
Chapter 1, Designing OpenStack Cloud Architecture, provided a few hints on how to
prepare for the first design steps: do not lock keys inside your car. At this point, we can
go further due to the emerging different topologies, and it is up to you to decide
what will fit best. The first question that may come to your mind is: OpenStack
does not include native HA components, so how can you include them? There are
widely used solutions for each component, which we cited in a nutshell in the
previous chapter.
Understanding HAProxy
HAProxy stands for High Availability Proxy. It is a free load balancing software tool
that aims to proxy and direct requests to the most available nodes based on TCP/
HTTP traffic. It includes a load balancer feature that can act as a frontend server. At
this point, we find two different types of servers within an HAProxy setup:
A frontend server listens for incoming requests on a given IP address and
port and passes them on to a backend
A backend server defines a set of servers in the cluster that receive the
forwarded requests
Load balancing layer 4: The transport layer is used for load balancing. All the
traffic arriving on a given IP address and port is forwarded to the backend
servers, regardless of the content of the request.
Load balancing layer 7: The application layer is used for load balancing.
This is a good way to load balance network traffic. Simply put, this mode
allows you to forward requests to different backend servers based on the
content of the request itself.
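HAProxy can balance either raw TCP (layer 4) or HTTP (layer 7). A hedged haproxy.cfg sketch (the names, ports, and backends are illustrative and not part of this chapter's setup):

```
# Layer 4: forward raw TCP (for example, MySQL) to a backend pool
frontend mysql_l4
    mode tcp
    bind *:3306
    default_backend db-pool

# Layer 7: route HTTP requests based on the URL path
frontend web_l7
    mode http
    bind *:80
    acl is_api path_beg /api
    use_backend api-pool if is_api
    default_backend web-pool
```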
Many load balancing algorithms can be used within an HAProxy setup. It is
the job of the algorithm to determine which server in the backend should be
selected to acquire the load. Some of them are as follows:
Roundrobin (RR): Each server is used in turn, according to its weight.
Leastconn: The server that currently has the lowest number of connections
is selected.
Source: This algorithm ensures that the request will be forwarded to the
same server based on a hash of the source IP as long as the server is still up.
Contrary to RR and leastconn, the source algorithm is considered
a static algorithm, which presumes that any change to the server's
weight on the fly does not have any effect on processing the load.
URI: This ensures that a request will be forwarded to the same server
based on its URI. It is ideal for increasing the cache-hit rate in proxy-cache
implementations.
Like the source, the URI algorithm is static in that updating the
server's weight on the fly will not have any effect on processing
the load.
You may wonder how the previous algorithms determine which servers in
OpenStack should be selected. Eventually, the hallmark of HAProxy is its health
checking of server availability. HAProxy performs health checks by automatically
disabling any backend server that is not listening on a particular IP address and port.
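A hedged backend sketch tying the two ideas together (the addresses and tuning values are illustrative): the balance directive selects the algorithm, and the check keyword enables the health probes just described:

```
backend db-pool
    balance leastconn
    # probe every 2s; mark a server down after 3 failed checks,
    # and up again after 2 successful ones
    server db01 10.0.0.11:3306 check inter 2000 fall 3 rise 2
    server db02 10.0.0.12:3306 check inter 2000 fall 3 rise 2
```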
But how does HAProxy handle connections? To answer this question, you should
refer to the first logical design in Chapter 1, Designing OpenStack Cloud Architecture,
which is created with virtual IP (VIP). Let's refresh our memory about the things
that we can see there by treating a few use cases within a VIP.
Keepalived is a free software tool that provides high availability and load
balancing facilities based on its framework in order to check a Linux Virtual
Server (LVS) pool state.
LVS is a highly available server built on a cluster of real servers by
running a load balancer on the Linux operating system. It is mostly
used to build scalable web, mail, and FTP services.
As shown in the previous illustration, nothing is magic! Keepalived uses the Virtual
Router Redundancy Protocol (VRRP) to eliminate SPOFs by making IPs
highly available. VRRP implements virtual routing between two or more servers in
a static, default-routed environment. In the event of a master router failure, a
backup node takes over the master state after a period of time.
In a standard VRRP setup, the backup node keeps listening for
multicast packets from the master node, which advertises with a given
priority. If the backup node fails to receive any VRRP advertisement
packets for a certain period, it takes over the master state by assigning
the routed IP to itself. In a setup with multiple backup nodes of the same
priority, the one with the highest IP address is selected as the new master.
HA the database
There's no doubt that behind any cluster lies a story! Making your database highly
available in an OpenStack environment is not negotiable. We have set up MySQL on
the cloud controller nodes, though it can also be installed on separate ones. Most
importantly, keep it safe, not only from water, but also from fire. Many clustering
techniques have been proposed to make MySQL highly available. Some of the MySQL
HA architectures can be listed as follows:
All you need are Linux boxes: DRBD works in the kernel layer, exactly at the
bottom of the system I/O stack.
With shared storage devices, writing to multiple nodes
simultaneously requires a cluster-aware filesystem, such as
the Linux Global File System (GFS).
To keep things simple, the main idea of CBR is to assume that the database is
transactional, meaning that it can roll back uncommitted changes, and that
replicated events are applied in the same order across all the instances.
Replication is truly parallel; each write set carries an ID that is checked on every
node. What Galera brings as an added value to our OpenStack MySQL HA is ease
of scalability: joining a node to a Galera cluster is automated, which matters in
production. The end design brings an active-active multimaster topology with
less latency and transaction loss.
A very interesting point in the last illustration is that every MySQL node
in the OpenStack cluster must be patched with the Write-Set Replication
(wsrep) API. If you already have a MySQL master-master setup actively working,
you will need to install wsrep and configure your cluster.
Wsrep is a project that aims to develop a generic replication plugin
interface for databases. Galera is one of the projects that uses the
wsrep API through its wsrep replication library calls.
HA in the queue
RabbitMQ is mainly responsible for communication between different
OpenStack services. The question is fairly simple: no queue, no OpenStack service
intercommunication. Now that you get the point, another critical service needs to
be available and survive the failures. RabbitMQ is mature enough to support its
own cluster setup without the need to go for Pacemaker or another clustering
software solution.
The amazing part about using RabbitMQ is the different ways in which such a
messaging system can reach scalability using an active/active design with:
RabbitMQ clustering: Any data or state needed for the RabbitMQ broker to
be operational is replicated across all nodes.
RabbitMQ mirrored queues: Queues, which hold the message state, are
mirrored across nodes so that they survive the failure of the node hosting them.
Like any standard cluster setup, the original node handling a queue can be
thought of as the master, while the mirrored queues on different nodes are purely
slave copies. The failure of the master results in the oldest slave being promoted
to the new master.
Implementing HA on MySQL
In this implementation, we will need three separate MySQL nodes and two HAProxy
servers so that we can guarantee that our load balancer fails over if one of them fails.
Keepalived will be installed on each HAProxy node to control the VIP. The different
nodes in this setup will be assigned as follows:
VIP: 192.168.47.47
HAProxy01: 192.168.47.120
HAProxy02: 192.168.47.121
MySQL01: 192.168.47.125
MySQL02: 192.168.47.126
MySQL03: 192.168.47.127
3. Let's configure our first HAProxy node. We start by backing up the default
configuration file:
packtpub@haproxy1$ sudo cp /etc/haproxy/haproxy.cfg \
/etc/haproxy/haproxy.cfg.bak
packtpub@haproxy1$ sudo nano /etc/haproxy/haproxy.cfg
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     1020
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats.sock mode 600 level admin
    stats timeout 2m

defaults
    mode            tcp
    log             global
    option          dontlognull
    option          redispatch
    retries         3
    timeout queue   45s
    timeout connect 5s
    timeout client  1m
    timeout server  1m
    timeout check   10s
    maxconn         1020

listen haproxy-monitoring *:80
    mode            tcp
    stats           enable
    stats           show-legends
    stats           refresh 5s
    stats           uri /
    stats           realm Haproxy\ Statistics
    stats           auth monitor:packadmin
    stats           admin if TRUE

frontend haproxy1
    bind            *:3306
    default_backend mysql-os-cluster
backend mysql-os-cluster
    balance roundrobin
    server mysql01 192.168.47.125:3306 check
    server mysql02 192.168.47.126:3306 check
    server mysql03 192.168.47.127:3306 check
To bind a virtual address that does not exist physically on the server, you can
add the following option to sysctl.conf in your CentOS box:
net.ipv4.ip_nonlocal_bind=1
On the same node, configure Keepalived to track the HAProxy process and hold
the VIP:
vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance MYSQL_VIP {
    interface eth0
    virtual_router_id 120
    priority 111
    advert_int 1
    virtual_ipaddress {
        192.168.47.47 dev eth0
    }
    track_script {
        chk_haproxy
    }
}
7. Repeat step 6 on the HAProxy2 node, replacing the priority with a lower
value, for example, 110.
8. Check whether the VIP was assigned to eth0 in both the nodes:
packtpub@haproxy1$ ip addr show eth0
packtpub@haproxy2$ ip addr show eth0
9. Now you have HAProxy and Keepalived ready and configured; all we need
to do is set up the Galera plugin through all the MySQL nodes in the cluster:
packtpub@db01$ wget https://launchpad.net/codership-mysql/5.6/5.6.16-25.5/+download/MySQL-server-5.6.16_wsrep_25.5-1.rhel6.x86_64.rpm
packtpub@db01$ wget https://launchpad.net/galera/0.8/0.8.0/+download/galera-0.8.0-x86_64.rpm
packtpub@db01$ sudo rpm -i galera-0.8.0-x86_64.rpm \
MySQL-server-5.6.16_wsrep_25.5-1.rhel6.x86_64.rpm
If you did not install MySQL with Galera from scratch, you
should stop the mysql service first before proceeding with the
Galera plugin installation. The example assumes that MySQL
is installed and stopped. More information about the usage
of Galera in OpenStack can be found at http://docs.openstack.org/high-availability-guide/content/ha-aa-db-mysql-galera.html.
11. Once the Galera plugin is installed, log in to your MySQL nodes and create
a new galera user with the galerapass password and, optionally, a haproxy
user without a password (for HAProxy monitoring), for the sake of simplicity.
Note that for MySQL clustering, a new sst user must exist. We will set the
sstpassword password for node authentication:
mysql> GRANT USAGE ON *.* to sst@'%' IDENTIFIED BY 'sstpassword';
mysql> GRANT ALL PRIVILEGES on *.* to sst@'%';
mysql> GRANT USAGE on *.* to galera@'%' IDENTIFIED BY
'galerapass';
mysql> INSERT INTO mysql.user (host,user) values ('%','haproxy');
mysql> FLUSH PRIVILEGES;
mysql> quit
12. Configure the MySQL wsrep Galera library on each MySQL node in /etc/
mysql/conf.d/wsrep.cnf.
For db01.packtpub.com, add this code:
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://"
wsrep_sst_method=rsync
wsrep_sst_auth=sst:sstpassword
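For the joining nodes, a hedged sketch (db01's address and the standard Galera port 4567 are taken from this chapter's setup; adjust to your first node): wsrep_cluster_address points at an existing cluster member instead of being empty:

```
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address="gcomm://192.168.47.125:4567"
wsrep_sst_method=rsync
wsrep_sst_auth=sst:sstpassword
```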
From the MySQL command line, set your global MySQL settings as follows:
mysql> set global wsrep_cluster_address='gcomm://192.168.47.125:4567';
13. Check whether the Galera replication is running the way it should:
packtpub@db01$ mysql -e "show status like 'wsrep%'"
Additional checks can be verified from the MySQL command line. On db01.
packtpub.com, you can run:
mysql> show status like 'wsrep%';
| wsrep_cluster_size   | 3       |
| wsrep_cluster_status | Primary |
| wsrep_connected      | ON      |
The wsrep_cluster_size value of 3 means that our cluster sees three connected
nodes, while wsrep_cluster_status shows that the current node is part of the
Primary component.
Starting from step 9, you can add a new MySQL node to join the cluster.
Note that we have separated our MySQL cluster from the cloud controller,
which means that the OpenStack services running on it, including
Keystone, Glance, Nova, and Cinder, as well as the Neutron nodes, need to
point to the right MySQL server. Remember that we are using HAProxy, while
the VIP is managed by Keepalived, for MySQL high availability. Thus, you
will need to reconfigure the virtual IP in each service, as follows:
Nova: /etc/nova/nova.conf
sql_connection=mysql://nova:[email protected]/nova

Keystone: /etc/keystone/keystone.conf
sql_connection=mysql://keystone:[email protected]/keystone

Glance: /etc/glance/glance-registry.conf
sql_connection=mysql://glance:[email protected]/glance

Neutron: /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini
sql_connection=mysql://neutron:[email protected]/neutron

Cinder: /etc/cinder/cinder.conf
sql_connection=mysql://cinder:[email protected]/cinder
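As a quick sanity check (our own, not from the book), connecting through the VIP with the galera user created above and asking for the server hostname should show the round-robin balancer alternating between MySQL nodes:

```
$ mysql -h 192.168.47.47 -u galera -pgalerapass -e "SELECT @@hostname;"
$ mysql -h 192.168.47.47 -u galera -pgalerapass -e "SELECT @@hostname;"
```

While all the nodes are healthy, each invocation should report a different backend.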
Implementing HA on RabbitMQ
In this setup, we will introduce minor changes to our RabbitMQ instances
running on the cloud controller nodes. We will enable the mirrored queue option in
our RabbitMQ brokers. In this example, we assume that the RabbitMQ service is
running on three OpenStack cloud controller nodes, with the load balancers set up
as follows:
VIP: 192.168.47.47
HAProxy01: 192.168.47.120
HAProxy02: 192.168.47.121
2. On both additional nodes, set the rabbitmq group and user ownership, with
400 file permissions, on the Erlang cookie:
packtpub@cc02$ sudo chown rabbitmq:rabbitmq \
/var/lib/rabbitmq/.erlang.cookie
packtpub@cc02$ sudo chmod 400 /var/lib/rabbitmq/.erlang.cookie
packtpub@cc03$ sudo chown rabbitmq:rabbitmq \
/var/lib/rabbitmq/.erlang.cookie
packtpub@cc03$ sudo chmod 400 /var/lib/rabbitmq/.erlang.cookie
4. Now, it's time to form the cluster and enable the mirrored queue option.
Currently, all three RabbitMQ brokers are independent and are not
aware of each other. Let's instruct them to join one cluster unit. First, stop the
RabbitMQ application on the joining node.
On the cc02 node, run these commands:
# rabbitmqctl stop_app
Stopping node 'rabbit@cc02' ...
...done.
# rabbitmqctl join_cluster rabbit@cc01
Clustering node 'rabbit@cc02' with 'rabbit@cc01' ...
...done.
# rabbitmqctl start_app
Starting node 'rabbit@cc02' ...
...done.
5. Check the nodes in the cluster by running the following from any RabbitMQ node:
# rabbitmqctl cluster_status
Cluster status of node 'rabbit@cc03' ...
[{nodes,[{disc,['rabbit@cc01','rabbit@cc02',
'rabbit@cc03']}]},
{running_nodes,['rabbit@cc01','rabbit@cc02',
'rabbit@cc03']},
{partitions,[]}]
...done.
6. The last step will instruct RabbitMQ to use mirrored queues. Mirrored
queues enable both producers and consumers in each queue to connect to
any RabbitMQ broker and access the same message queues. The following
command will sync all the queues across all cloud controller nodes by
setting an HA policy:
# rabbitmqctl set_policy HA '^(?!amq\.).*' \
'{"ha-mode":"all", "ha-sync-mode":"automatic"}'
Note that the previous command line sets a policy where all
queues are mirrored to all nodes in the cluster.
7. To make each RabbitMQ cluster node rejoin the cluster on restart, edit its
configuration file, /etc/rabbitmq/rabbitmq.config:
[{rabbit,
  [{cluster_nodes, {['rabbit@cc01', 'rabbit@cc02', 'rabbit@cc03'],
   ram}}]}].
Using the VIP to manage both HAProxy nodes as a proxy for RabbitMQ might
require you to configure each OpenStack service to use the 192.168.47.47
address and the 5470 port. Thus, you will need to reconfigure the RabbitMQ
settings in each service to point at the VIP, as follows:
Nova: /etc/nova/nova.conf:
# crudini --set /etc/nova/nova.conf DEFAULT rabbit_host 192.168.47.47
# crudini --set /etc/nova/nova.conf DEFAULT rabbit_port 5470
Glance: /etc/glance/glance-api.conf:
# crudini --set /etc/glance/glance-api.conf DEFAULT rabbit_host 192.168.47.47
# crudini --set /etc/glance/glance-api.conf DEFAULT rabbit_port 5470
Neutron: /etc/neutron/neutron.conf:
# crudini --set /etc/neutron/neutron.conf DEFAULT rabbit_host 192.168.47.47
# crudini --set /etc/neutron/neutron.conf DEFAULT rabbit_port 5470
Cinder: /etc/cinder/cinder.conf:
# crudini --set /etc/cinder/cinder.conf DEFAULT rabbit_host 192.168.47.47
# crudini --set /etc/cinder/cinder.conf DEFAULT rabbit_port 5470
Corosync allows any server to join a cluster using active-active or active-passive
fault-tolerant configurations. You will need to choose an unused
multicast address and a port. Create a backup of the original Corosync
configuration file and edit /etc/corosync/corosync.conf as follows:
# cp /etc/corosync/corosync.conf /etc/corosync/corosync.conf.bak
# nano /etc/corosync/corosync.conf
interface {
    ringnumber: 0
    bindnetaddr: 192.168.47.0
    mcastaddr: 239.225.47.10
    mcastport: 4000
    ....
}
On cc01, we can set up a VIP that will be shared between the three servers.
We can use 192.168.47.48 as the VIP with a 3-second monitoring interval:
# crm configure primitive VIP ocf:heartbeat:IPaddr2 params \
ip=192.168.47.48 cidr_netmask=32 op monitor interval=3s
We can see that the VIP has been assigned to the cc01 node. Note that the
VIP will be reassigned to the next cloud controller if cc01 does not respond
within 3 seconds:
# crm_mon -1
Online: [ cc01 cc02 ]
VIP    (ocf::heartbeat:IPaddr2):    Started cc01
Optionally, you can create a new directory to save all downloaded resource
agent scripts under /usr/lib/ocf/resource.d/openstack.
Creating a new VIP will require you to point OpenStack services
to the new virtual address. You can overcome such repetitive
reconfiguration by keeping both IP addresses of the cloud controller
and the VIP. In each cloud controller, ensure that you have
exported the needed environment variables as follows:
# export OS_AUTH_URL=http://192.168.47.48:5000/v2.0/
You can check whether Pacemaker is aware of the new RAs
by running this:
# crm ra info ocf:openstack:nova-api
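Following the same pattern, other OpenStack services can be registered with Pacemaker through their resource agents. A hedged sketch for nova-api (the resource name and parameter values are illustrative):

```
# Register nova-api as a Pacemaker resource, monitored every 5 seconds
crm configure primitive p_nova-api ocf:openstack:nova-api \
    params config="/etc/nova/nova.conf" \
    op monitor interval="5s" timeout="5s"
```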
p_scheduler (ocf::openstack:nova-scheduler):          Started cc01
p_nova-novnc (ocf::openstack:nova-vnc):               Started cc01
p_keystone (ocf::openstack:keystone):                 Started cc01
p_glance-api (ocf::openstack:glance-api):             Started cc01
p_glance-registry (ocf::openstack:glance-registry):   Started cc01
p_neutron-server (ocf::openstack:neutron-server):     Started cc01
To use private and public IP addresses, you might need to create two
different VIPs. For example, you will have to define your endpoint as
follows:
keystone endpoint-create --region $KEYSTONE_REGION \
  --service-id $service-id \
  --publicurl 'http://PUBLIC_VIP:9292' \
  --adminurl 'http://192.168.47.48:9292' \
  --internalurl 'http://192.168.47.48:9292'
plugin_config="/etc/neutron/metadata_agent.ini" \
    op monitor interval="5s" timeout="5s"
Summary
In this chapter, you learned some of the most important concepts of high
availability and failover. You also learned the different options available to build a
redundant OpenStack architecture with robust resiliency, and how to diagnose your
OpenStack design by eliminating any SPOF across all services. We highlighted
different out-of-the-box open source solutions to arm our OpenStack infrastructure
and make it as fault-tolerant as possible. Different technologies were introduced,
such as HAProxy, database replication with Galera, Keepalived, Pacemaker, and
Corosync. This completes the first part of the book, which aimed to cover different
architecture levels and several solutions to end up with an optimal OpenStack
solution for medium and large infrastructure deployments.
Now that we have crystallized the high availability aspect of our private cloud, we
will focus on building a multinode OpenStack environment in the next chapter and
dive deeper into orchestrating it. You can call it my first production day.