Cloud Native Applications
Many of the designations used by manufacturers and sellers to distinguish their products are
claimed as trademarks. Where those designations appear in the book, and Manning
Publications was aware of a trademark claim, the designations have been printed in initial caps
or all caps.
Recognizing the importance of preserving what has been written, it is Manning’s policy to have
the books we publish printed on acid-free paper, and we exert our best efforts to that end.
Recognizing also our responsibility to conserve the resources of our planet, Manning books
are printed on paper that is at least 15 percent recycled and processed without the use of
elemental chlorine.
ISBN 9781617294310
Printed in the United States of America
1 2 3 4 5 6 7 8 9 10 - EBM - 21 20 19 18 17 16
DOCKER IN PRACTICE
Discovering Docker
Chapter 1 from Docker in Practice
MESOS IN ACTION
Introducing Mesos
Chapter 1 from Mesos in Action
RABBITMQ IN DEPTH
Foundational RabbitMQ
Chapter 1 from RabbitMQ in Depth
NETTY IN ACTION
Case studies, part 1
Chapter 14 from Netty in Action
index
What is the biggest advantage of using Amazon Web Services (AWS)? For
us, it's being able to automate every part of your cloud infrastructure. AWS offers
an API and lots of tools to launch, configure, modify, and delete computing,
storage, and networking infrastructure. Our book, Amazon Web Services in Action,
provides a deep introduction to the most important services and architecture
principles. Chapter 1 answers the question: What is Amazon Web Services? You'll
learn about the concepts behind AWS and gain a brief overview of what you can
do with AWS.
What is Amazon Web Services?
Amazon Web Services (AWS) is a platform of web services offering solutions for
computing, storing, and networking, at different layers of abstraction. You can use
these services to host web sites, run enterprise applications, and mine tremendous
amounts of data. The term web service means services can be controlled via a web
interface. The web interface can be used by machines or by humans via a graphical
user interface. The most prominent services are EC2, which offers virtual servers,
and S3, which offers storage capacity. Services on AWS work well together; you can
use them to replicate your existing on-premises setup or design a new setup from
scratch. Services are charged for on a pay-per-use pricing model.
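As a taste of what controlling AWS from a machine looks like, here is a minimal sketch (not from the book) that lists resources through the API using the AWS SDK for Python (boto3); it assumes boto3 is installed and credentials are already configured.

# Minimal sketch (not from the book): calling the AWS API from code with boto3.
import boto3

s3 = boto3.client("s3")                      # S3 offers storage capacity
for bucket in s3.list_buckets()["Buckets"]:
    print(bucket["Name"])

ec2 = boto3.client("ec2")                    # EC2 offers virtual servers
for reservation in ec2.describe_instances()["Reservations"]:
    for instance in reservation["Instances"]:
        print(instance["InstanceId"], instance["State"]["Name"])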
As an AWS customer, you can choose among different data centers. AWS data cen-
ters are distributed in the United States, Europe, Asia, and South America. For exam-
ple, you can start a virtual server in Japan in the same way you can start a virtual server
in Ireland. This enables you to serve customers worldwide with a global infrastructure.
The map in figure 1.1 shows the data centers available to all customers.
(Figure 1.1 map labels: Germany, Ireland, U.S. West 1, Singapore, Australia, and Brazil.)
1. Bernard Golden, "Amazon Web Services (AWS) Hardware," For Dummies, http://mng.bz/k6lT.
Figure 1.2 Running a web shop on-premises vs. on AWS
John realized that other options are available to improve his setup on AWS with addi-
tional services:
The web shop consists of dynamic content (such as products and their prices)
and static content (such as the company logo). By splitting dynamic and static
content, John reduced the load for his web servers and improved performance
by delivering the static content over a content delivery network (CDN).
John uses maintenance-free services including a database, an object store, and a
DNS system on AWS. This frees him from managing these parts of the system,
decreases operational costs, and improves quality.
The application running the web shop can be installed on virtual servers. John
split the capacity of the old on-premises server into multiple smaller virtual serv-
ers at no extra cost. If one of these virtual servers fails, the load balancer will
send customer requests to the other virtual servers. This setup improves the web
shop’s reliability.
Figure 1.3 shows how John enhanced the web shop setup with AWS.
John started a proof-of-concept project and found that his web application can be
transferred to AWS and that services are available to help improve his setup.
Figure 1.3 Running a web shop on AWS with CDN for better performance, a load balancer for
high availability, and a managed database to decrease maintenance costs
To do so, she defines a virtual network in the cloud and connects it to the corpo-
rate network through a virtual private network (VPN) connection. The company
can control access and protect mission-critical data by using subnets and control
traffic between them with access-control lists. Maureen controls traffic to the
internet using Network Address Translation (NAT) and firewalls. She installs
application servers on virtual machines (VMs) to run the Java EE application. Mau-
reen is also thinking about storing data in a SQL database service (such as Oracle
Database Enterprise Edition or Microsoft SQL Server EE). Figure 1.4 illustrates Mau-
reen’s architecture.
Maureen has managed to connect the on-premises data center with a private net-
work on AWS. Her team has already started to move the first enterprise application to
the cloud.
(Figure 1.4: a virtual network, 10.10.0.0/16, on AWS with private subnets 10.10.0.0/24, 10.10.1.0/24, and 10.10.2.0/24; an internet gateway with NAT; a Java EE server; a SQL database; and a VPN gateway connecting the corporate network, 10.20.0.0/16, over a VPN.)
offer the possibility of sharing documents within the office. Storing all the data is a
challenge for him:
He needs to back up all files to prevent the loss of critical data. To do so, Greg
copies the data from the file server to another network-attached storage system, so he
had to buy the hardware for the file server twice. The file server and the backup
server are located close together, so he is failing to meet disaster-recovery
requirements such as recovering from a fire or a break-in.
To meet legal and business data archival requirements, Greg needs to store data
for a long time. Storing data for 10 years or longer is tricky. Greg uses an expen-
sive archive solution to do so.
To save money and increase data security, Greg decided to use AWS. He transferred
data to a highly available object store. A storage gateway makes it unnecessary to buy
and operate network-attached storage and a backup on-premises. A virtual tape deck
takes over the task of archiving data for the required length of time. Figure 1.5
shows how Greg implemented this use case on AWS and compares it to the
on-premises solution.
Greg is fine with the new solution to store and archive data on AWS because he was
able to improve quality and he gained the possibility of scaling storage size.
(Figure: a fault-tolerant setup across two data centers. Users reach a load balancer over the internet; data center A runs a web server and the database master, and data center B runs a web server and a database standby.)
1. Greg Bensinger, "Amazon Conference Showcases Another Side of the Retailer's Business," Digits, Nov. 12, 2014, http://mng.bz/hTBo.
2. "Amazon.com's Management Discusses Q1 2014 Results - Earnings Call Transcript," Seeking Alpha, April 24, 2014, http://mng.bz/60qX.
Flexible capacity also means you can shut down unused systems. In one of our last proj-
ects, the test environment only ran from 7:00 a.m. to 8:00 p.m. on weekdays, allowing
us to save 60%.
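A quick back-of-the-envelope check of that figure, as a small Python sketch of ours (not from the book):

# Rough check of the savings from running the test environment only on weekdays,
# 7:00 a.m. to 8:00 p.m., instead of around the clock.
hours_always_on = 24 * 7
hours_weekdays_only = (20 - 7) * 5
savings = 1 - hours_weekdays_only / hours_always_on
print(f"{savings:.0%}")   # about 61%, in line with the roughly 60% mentioned above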
1.3.8 Worldwide
You can deploy your applications as close to your customers as possible. AWS has data
centers in the following locations:
United States (northern Virginia, northern California, Oregon)
Europe (Germany, Ireland)
Asia (Japan, Singapore)
Australia
South America (Brazil)
With AWS, you can run your business all over the world.
FedRAMP & DoD CSM —Ensures secure cloud computing for the U.S. Federal
Government and the U.S. Department of Defense
PCI DSS Level 1 —A data security standard (DSS) for the payment card industry
(PCI) to protect cardholder data
ISO 9001 —A standardized quality management approach used worldwide and
certified by an independent and accredited certification body
If you’re still not convinced that AWS is a professional partner, you should know that
Airbnb, Amazon, Intuit, NASA, Nasdaq, Netflix, SoundCloud, and many more are run-
ning serious workloads on AWS.
The cost benefit is elaborated in more detail in the next section.
Let’s assume your web shop started successfully in January, and you decided to run a
marketing campaign to increase sales for the next month. Lucky you: you were able to
increase the number of visitors of your web shop fivefold in February. As you already
know, you have to pay for AWS based on usage. Table 1.1 shows your bills for January
and February. The number of visitors increased from 100,000 to 500,000, and your
monthly bill increased from 142.37 USD to 538.09 USD, which is roughly a 3.8-fold increase.
Because your web shop had to handle more traffic, you had to pay more for services,
such as the CDN, the web servers, and the database. Other services, like the storage of
static files, didn’t experience more usage, so the price stayed the same.
With AWS, you can achieve a linear relationship between traffic and costs. And
other opportunities await you with this pricing model.
Table 1.1 How an AWS bill changes if the number of web shop visitors increases
Service | January | February | February charge | Increase over January
Load balancer | 748 hours + 50 GB traffic | 748 hours + 250 GB traffic | 20.30 USD | 1.60 USD
Web servers | 1 server = 748 hours | 4 servers = 2,992 hours | 204.96 USD | 153.72 USD
Database (748 hours) | Small server + 20 GB storage | Large server + 20 GB storage | 170.66 USD | 128.10 USD
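To see how the bill relates to the traffic, you can check the numbers above with a few lines of Python (an illustrative sketch, not from the book):

# Illustrative check of the numbers above (not from the book).
visitors_jan, bill_jan = 100_000, 142.37   # USD
visitors_feb, bill_feb = 500_000, 538.09   # USD

print(bill_feb / bill_jan)                 # about 3.8x the cost for 5x the visitors
print(bill_jan / visitors_jan * 1000)      # about 1.42 USD per 1,000 visitors in January
print(bill_feb / visitors_feb * 1000)      # about 1.08 USD per 1,000 visitors in February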
Otherwise, AWS is your best bet because the chances are highest that you’ll find a solu-
tion for your problem.
Following are some common features of cloud providers:
Virtual servers (Linux and Windows)
Object store
Load balancer
Message queuing
Graphical user interface
Command-line interface
The more interesting question is, how do cloud providers differ? Table 1.2 compares
AWS, Azure, Google Cloud Platform, and OpenStack.
Table 1.2 Differences between AWS, Microsoft Azure, Google Cloud Platform, and OpenStack
Compliance:
  AWS: Common standards (ISO 27001, HIPAA, FedRAMP, SOC), IT-Grundschutz (Germany), G-Cloud (UK)
  Azure: Common standards (ISO 27001, HIPAA, FedRAMP, SOC), ISO 27018 (cloud privacy), G-Cloud (UK)
  Google Cloud Platform: Common standards (ISO 27001, HIPAA, FedRAMP, SOC)
  OpenStack: Yes (depends on the OpenStack provider)
Integration into development process:
  AWS: Medium, not linked to specific ecosystems
  Azure: High, linked to the Microsoft ecosystem (for example, .NET development)
  Google Cloud Platform: High, linked to the Google ecosystem (for example, Android)
  OpenStack: -
Relational database:
  AWS: Yes (MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server)
  Azure: Yes (Azure SQL Database, Microsoft SQL Server)
  Google Cloud Platform: Yes (MySQL)
  OpenStack: Yes (depends on the OpenStack provider)
In our opinion, AWS is the most mature cloud platform available at the moment.
Figure 1.9 The AWS cloud is composed of hardware (compute, storage, network) and software services accessible via an API; an administrator manages the services through that API. Service categories shown: Compute (virtual server), App (queues, search), Enterprise (directory service, mail), Deployment (access rights, monitoring), Storage (object store, archiving), Database (relational, NoSQL), and Networking (DNS, virtual network).
Figure 1.10 Managing a custom application running on a virtual server and dependent services: the administrator manages both through the API.
access. This means you can install any software you like on a virtual server. Other ser-
vices, like the NoSQL database service, offer their features through an API and hide
everything that’s going on behind the scenes. Figure 1.10 shows an administrator
installing a custom PHP web application on a virtual server and managing dependent
services such as a NoSQL database used by the PHP web application.
Users send HTTP requests to a virtual server. A web server is installed on this virtual
server along with a custom PHP web application. The web application needs to talk to
AWS services in order to answer HTTP requests from users. For example, the web
application needs to query data from a NoSQL database, store static files, and send
email. Communication between the web application and AWS services is handled by
the API, as figure 1.11 shows.
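As an illustration of this pattern, the following sketch uses the AWS SDK for Python (boto3) to talk to a NoSQL table, an object store bucket, and the email service. It is not code from the book; the table name, bucket name, file, and addresses are made-up placeholders.

# Illustrative sketch (not from the book) of an application calling AWS services
# through the API with boto3. All names and addresses are placeholders.
import boto3

dynamodb = boto3.resource("dynamodb")                 # NoSQL database service
orders = dynamodb.Table("orders")
item = orders.get_item(Key={"order_id": "42"}).get("Item")

s3 = boto3.client("s3")                               # object store for static files
with open("logo.png", "rb") as static_file:
    s3.put_object(Bucket="my-webshop-static", Key="logo.png", Body=static_file)

ses = boto3.client("ses")                             # email-sending service
ses.send_email(
    Source="[email protected]",
    Destination={"ToAddresses": ["[email protected]"]},
    Message={"Subject": {"Data": "Your order"},
             "Body": {"Text": {"Data": "Thanks for your order!"}}},
)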
The number of different services available can be scary at the outset. The following
categorization of AWS services will help you to find your way through the jungle:
Compute services offer computing power and memory. You can start virtual serv-
ers and use them to run your applications.
App services offer solutions for common use cases like message queues, topics,
and searching large amounts of data to integrate into your applications.
Enterprise services offer independent solutions such as mail servers and directory
services.
Deployment and administration services work on top of the services mentioned so
far. They help you grant and revoke access to cloud resources, monitor your vir-
tual servers, and deploy applications.
Storage is needed to collect, persist, and archive data. AWS offers different stor-
age options: an object store or a network-attached storage solution for use with
virtual servers.
Database storage has some advantages over simple storage solutions when you
need to manage structured data. AWS offers solutions for relational and NoSQL
databases.
Networking services are an elementary part of AWS. You can define private net-
works and use a well-integrated DNS.
Be aware that we cover only the most important categories and services here. Other
services are available, and you can also run your own applications on AWS.
Now that we’ve looked at AWS services in detail, it’s time for you to learn how to
interact with those services.
Figure 1.11 Handling an HTTP request with a custom web application using additional AWS services: users send HTTP requests to the virtual server, and the web application calls AWS services through the API.
(Figure: ways to interact with AWS. Manual: the web-based Management Console and the command-line interface. Automation: SDKs for Java, Python, JavaScript, and more, plus blueprints. All of them talk to the services through the API.)
The CLI is typically used to automate tasks on AWS. If you want to automate parts of
your infrastructure with the help of a continuous integration server like Jenkins, the
CLI is the right tool for the job. The CLI offers a convenient way to access the API and
combine multiple calls into a script.
You can even begin to automate your infrastructure with scripts by chaining multi-
ple CLI calls together. The CLI is available for Windows, Mac, and Linux, and there’s
also a PowerShell version available.
1.7.3 SDKs
Sometimes you need to call AWS from within your application. With SDKs, you can use
your favorite programming language to integrate AWS into your application logic. AWS
provides SDKs for the following:
Android, browsers (JavaScript), iOS, Java, .NET, Node.js (JavaScript), PHP, Python, Ruby, and Go
SDKs are typically used to integrate AWS services into applications. If you’re doing soft-
ware development and want to integrate an AWS service like a NoSQL database or a
push-notification service, an SDK is the right choice for the job. Some services, such as
queues and topics, must be used with an SDK in your application.
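For example, a queue can only be used through the API, so your application goes through an SDK. Here is a minimal sketch with the Python SDK (boto3); the queue name and message body are illustrative, not from the book.

# Minimal sketch (not from the book): using a queue through the SDK with boto3.
import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="webshop-tasks")["QueueUrl"]   # placeholder name

sqs.send_message(QueueUrl=queue_url, MessageBody="resize-image:logo.png")

response = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
for message in response.get("Messages", []):
    print(message["Body"])
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"])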
1.7.4 Blueprints
A blueprint is a description of your system containing all services and dependencies. The
blueprint doesn’t say anything about the necessary steps or the order to achieve the
described system. Figure 1.15 shows how a blueprint is transferred into a running system.
{
  infrastructure: {
    loadbalancer: {
      server: { ... }
    },
    cdn: { ... },
    database: { ... },
    dns: { ... },
    static: { ... }
  }
}
(Figure 1.15: a tool reads the blueprint and creates the running system: DNS, CDN, load balancer, server, database, and static files.)
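On AWS, blueprints take the form of CloudFormation templates (the book returns to this under Infrastructure as Code). As a rough sketch of the idea, not an example from the book, you hand a template to the API and let the tool create the described system; the stack name and the trivial one-resource template below are placeholders.

# Rough sketch (not from the book): submitting a blueprint (a CloudFormation
# template) through the API with boto3. Stack name and template are placeholders.
import json
import boto3

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "StaticFiles": {"Type": "AWS::S3::Bucket"}   # for example, a bucket for static files
    },
}

cloudformation = boto3.client("cloudformation")
cloudformation.create_stack(StackName="webshop", TemplateBody=json.dumps(template))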
1.8.1 Signing up
The sign-up process consists of five steps:
1 Provide your login credentials.
2 Provide your contact information.
3 Provide your payment details.
4 Verify your identity.
5 Choose your support plan.
Point your favorite modern web browser to https://aws.amazon.com, and click the
Create a Free Account / Create an AWS Account button.
1. PROVIDING YOUR LOGIN CREDENTIALS
The Sign Up page, shown in figure 1.16, gives you two choices. You can either create
an account using your Amazon.com account or create an account from scratch. If you
create the account from scratch, follow along. Otherwise, skip to step 5.
Fill in your email address, and select I Am a New User. Go on to the next step to cre-
ate your login credentials. We advise you to choose a strong password to prevent misuse of your account.
After you complete the first part, you’ll receive a call from AWS. A robot voice will ask
you for your PIN, which will be like the one shown in figure 1.20. Your identity will be
verified, and you can continue with the last step.
1.8.2 Signing In
You have an AWS account and are ready to sign in to the AWS Management Console at
https://console.aws.amazon.com. As mentioned earlier, the Management Console is
a web-based tool you can use to control AWS resources. The Management Console
uses the AWS API to make most of the functionality available to you. Figure 1.22 shows
the Sign In page.
Enter your login credentials and click Sign In Using Our Secure Server to see the
Management Console, shown in figure 1.23.
The most important part is the navigation bar at the top; see figure 1.24. It consists of
six sections:
AWS —Gives you a fast overview of all resources in your account.
Services —Provides access to all AWS services.
Custom section (Edit) —Click Edit and drag-and-drop important services here to
personalize the navigation bar.
Your name —Lets you access billing information and your account, and also lets
you sign out.
Your region —Lets you choose your region. You’ll learn about regions in section
3.5. You don’t need to change anything here now.
Support —Gives you access to forums, documentation, and a ticket system.
Next, you’ll create a key pair so you can connect to your virtual servers.
features that you can access via the navigation bar. The second column gives you a brief
overview of all your EC2 resources. The third column provides additional information.
Follow these steps to create a new key pair:
1 Click Key Pairs in the navigation bar under Network & Security.
2 Click the Create Key Pair button on the page shown in figure 1.26.
3 Name the Key Pair mykey. If you choose another name, you must replace the
name in all the following examples!
During key-pair creation, you downloaded a file called mykey.pem. You must now pre-
pare that key for future use. Depending on your operating system, you may need to do
things differently, so please read the section that fits your OS.
Figure 1.27 PuTTYgen allows you to convert the downloaded .pem file into the .ppk
file format needed by PuTTY.
1.9 Summary
Amazon Web Services (AWS) is a platform of web services offering solutions for
computing, storing, and networking that work well together.
Cost savings aren’t the only benefit of using AWS. You’ll also profit from an
innovative and fast-growing platform with flexible capacity, fault-tolerant ser-
vices, and a worldwide infrastructure.
Any use case can be implemented on AWS, whether it’s a widely used web appli-
cation or a specialized enterprise application with an advanced networking
setup.
You can interact with AWS in many different ways. You can control the different ser-
vices by using the web-based GUI; use code to manage AWS programmatically from
the command line or SDKs; or use blueprints to set up, modify, or delete your infra-
structure on AWS.
Pay-per-use is the pricing model for AWS services. Computing power, storage,
and networking services are billed similarly to electricity.
Creating an AWS account is easy. Now you know how to set up a key pair so you
can log in to virtual servers for later use.
What's inside
Overview of AWS cloud concepts and best practices
Manage servers on EC2 for cost-effectiveness
Infrastructure automation with Infrastructure as Code (AWS CloudFormation)
Deploy applications on AWS
Store data on AWS: SQL, NoSQL, object storage and block storage
Integrate Amazon's pre-built services
Architect highly available and fault tolerant systems
Written for developers and DevOps engineers moving distributed applications to the
AWS platform.
A docker was a labourer responsible for loading goods of all sizes and
shapes onto a ship. Docker is also the name of a tool that helps you deliver your
applications to all kinds of machines. By using standardization, Docker allows
you to automate the packaging and deployment of your applications. A nice side effect is
that this automation works on your local development machine as well as on
cloud infrastructure. Docker in Practice helps you to get started with Docker. The
first chapter, "Discovering Docker," explains the key concepts behind Docker.
Discovering Docker
Docker is a platform that allows you to “build, ship, and run any app, anywhere.” It
has come a long way in an incredibly short time and is now considered a standard
way of solving one of the costliest aspects of software: deployment.
Before Docker came along, the development pipeline typically consisted of
combinations of various technologies for managing the movement of software,
such as virtual machines, configuration management tools, different package man-
agement systems, and complex webs of library dependencies. All these tools
needed to be managed and maintained by specialist engineers, and most had their
own unique ways of being configured.
Docker has changed all of this, allowing different engineers involved in this pro-
cess to effectively speak one language, making working together a breeze. Everything
goes through a common pipeline to a single output that can be used on any target—
there’s no need to continue maintaining a bewildering array of tool configurations, as
shown in figure 1.1.
At the same time, there’s no need to throw away your existing software stack if it
works for you—you can package it up in a Docker container as-is for others to con-
sume. As a bonus, you can see how these containers were built, so if you need to dig
into the details, you can.
This book is aimed at intermediate developers with some knowledge of Docker. If
you’re OK with the basics, feel free to skip to the later chapters. The goal of this book
is to expose the real-world challenges that Docker brings and show how they can be
overcome. But first we’re going to provide a quick refresher on Docker itself. If you
want a more thorough treatment of Docker’s basics, take a look at Docker in Action by
Jeff Nickoloff (Manning Publications, 2016).
In chapter 2 you’ll be introduced to Docker’s architecture more deeply with the
aid of some techniques that demonstrate its power. In this chapter you’re going to
learn what Docker is, see why it’s important, and start using it.
Figure 1.1 How Docker has eased the tool maintenance burden
(Figure: shipping before and after standardized containers. Before, teams of dockers were required to load differently shaped items onto a ship; after, only one docker is needed to operate the machines designed to move containers. A single container holds different items, and it doesn't matter to the carrier what's inside; the container can be loaded up elsewhere, reducing the bottleneck of loading at port.)
This should sound familiar to anyone working in software. Much time and intellectual
energy is spent getting metaphorically odd-shaped software into different sized meta-
phorical ships full of other odd-shaped software, so they can be sold to users or busi-
nesses elsewhere.
Figure 1.3 shows how time and money can be saved with the Docker concept.
Before Docker, deploying software to different environments required significant
effort. Even if you weren’t hand-running scripts to provision software on different
machines (and plenty of people still do exactly that), you’d still have to wrestle with con-
figuration management tools that manage state on what are increasingly fast-moving
environments starved of resources. Even when these efforts were encapsulated in VMs,
a lot of time was spent managing the deployment of these VMs, waiting for them to boot,
and managing the overhead of resource use they created.
With Docker, the configuration effort is separated from the resource management,
and the deployment effort is trivial: run docker run, and the environment’s image is
pulled down and ready to run, consuming fewer resources and contained so that it
doesn’t interfere with other environments.
You don’t need to worry about whether your container is going to be shipped to a
RedHat machine, an Ubuntu machine, or a CentOS VM image; as long as it has
Docker on it, it’ll be good to go.
(Figure 1.3: installing, configuring, and maintaining the complex application is a single effort to manage deployment, producing one Docker image; docker run then deploys that image to the dev laptop, the test server, and the live server.)
(Figure: core Docker concepts.
Containers: A container is a running instance of an image. You can have multiple containers running from the same image. Containers are the running processes; images are stored on disk.
Images: An image is a collection of filesystem layers and some metadata. Taken together, they can be spun up as Docker containers.
Layers: A layer is a collection of changes to files. The differences between v1 and v2 of MyApplication are stored in such a layer.)
It’s most useful to get the concepts of images, containers, and layers clear in your
mind before you start running Docker commands. In short, containers are running sys-
tems defined by images. These images are made up of one or more layers (or sets of
diffs) plus some metadata for Docker.
Let’s look at some of the core Docker commands. We’ll turn images into contain-
ers, change them, and add layers to new images that we’ll commit. Don’t worry if all of
this sounds confusing. By the end of the chapter it will all be much clearer!
KEY DOCKER COMMANDS
Docker’s central function is to build, ship, and run software in any location that has
Docker.
To the end user, Docker is a command-line program that you run. Like git (or
any source control tool), this program has subcommands that perform different
operations.
The principal Docker subcommands you’ll use on your host are listed in table 1.1.
(Figure: three containers created from a single Ubuntu image, each with its own diffs from that image, for example MODIFIED: /opt/app/nodejs.log, DELETE: /etc/nologin, and ADDED: /var/log/apache/apache.log. Changes to files are stored within the container in a copy-on-write mechanism; the base image cannot be affected by a container. Containers are created from images, inherit their filesystems, and use their metadata to determine their startup configuration. Containers are separate but can be configured to communicate with each other.)
Figure 1.6 Building a Docker application: the ToDoApp Dockerfile is kept in a Git repository, and both my server and your server build the same ToDoApp Docker image from it.
THE TO-DO APPLICATION This to-do application will be used a few times through-
out the book, and it’s quite a useful one to play with and demonstrate, so it’s
worth familiarizing yourself with it.
Docker commands / "by hand": Fire up a container with docker run and input the commands to create your image on the command line. Create a new image with docker commit. (See technique 14.)
Dockerfile: Build from a known base image, and specify the build with a limited set of simple commands. (Discussed shortly.)
Dockerfile and configuration management (CM) tool: Same as Dockerfile, but hand over control of the build to a more sophisticated CM tool. (See technique 47.)
Scratch image and import of a set of files: From an empty image, import a TAR file with the required files. (See technique 10.)
The first “by hand” option is fine if you’re doing proofs of concept to see whether
your installation process works. At the same time, you should be keeping notes about
the steps you’re taking so that you can return to the same point if you need to.
At some point you’re going to want to define the steps for creating your image.
This is the second option (and the one we’ll use here).
For more complex builds, you may want to go for the third option, particularly
when the Dockerfile features aren’t sophisticated enough for your image’s needs.
The final option builds from a null image by overlaying the set of files required to
run the image. This is useful if you want to import a set of self-contained files created
elsewhere, but it’s rarely seen in mainstream use.
We’ll look at the Dockerfile method now; the other methods will be covered later
in the book.
FROM node                                                            (B: define the base image)
MAINTAINER [email protected]                                               (C: declare the maintainer)
RUN git clone -q https://github.com/docker-in-practice/todo.git      (D: clone the todoapp code)
WORKDIR todo                                                         (E: move to the new cloned directory)
RUN npm install > /dev/null                                          (F: run the node package manager's install command, npm)
EXPOSE 8000                                                          (G)
CMD ["npm","start"]                                                  (H)
You begin the Dockerfile by defining the base image with the FROM command B. This
example uses a Node.js image so you have access to the Node.js binaries. The official
Node.js image is called node.
Next, you declare the maintainer with the MAINTAINER command C. In this case,
we’re using one of our email addresses, but you can replace this with your own
reference because it’s your Dockerfile now. This line isn’t required to make a
working Docker image, but it’s good practice to include one. At this point, the build
has inherited the state of the node container, and you’re ready to work on top of it.
Next, you clone the todoapp code with a RUN command D. This uses the specified
command to retrieve the code for the application, running git within the container.
Git is installed inside the base node image in this case, but you can’t take this kind of
thing for granted.
Now you move to the new cloned directory with a WORKDIR command E. Not only
does this change directory within the build context, but the last WORKDIR command
determines which directory you’re in by default when you start up your container
from your built image.
Next, you run the node package manager’s install command (npm) F. This will set
up the dependencies for your application. You aren’t interested in the output here, so
you redirect it to /dev/null.
Because port 8000 is used by the application, you use the EXPOSE command to tell
Docker that containers from the built image should listen on this port G.
Finally, you use the CMD command to tell Docker which command will be run on
startup of the container H.
This simple example illustrates several key features of Docker and Dockerfiles. A
Dockerfile is a simple sequence of a limited set of commands run in strict order. They
affect the files and metadata of the resulting image. Here the RUN command affects the
filesystem by checking out and installing applications, and the EXPOSE, CMD, and
WORKDIR commands affect the metadata of the image.
To build the image, you issue the docker command followed by the path to the Dockerfile:
docker build .
Docker uploads the files and directories under the path supplied to the docker build
command. Each build step is numbered sequentially from 0 and output with the
command. Each command results in a new image being created, and the image ID is
output; to save space, each intermediate container is removed before continuing.
The debug output of the npm install step is edited out of this listing, and the final
image ID for this build is ready to tag.

Sending build context to Docker daemon 178.7 kB
Sending build context to Docker daemon
Step 0 : FROM node
 ---> fc81e574af43
Step 1 : MAINTAINER [email protected]
 ---> Running in 21af1aad6950
 ---> 8f32669fe435
Removing intermediate container 21af1aad6950
Step 2 : RUN git clone https://github.com/ianmiell/todo.git
 ---> Running in 0a030ee746ea
Cloning into 'todo'...
 ---> 783c68b2e3fc
Removing intermediate container 0a030ee746ea
Step 3 : WORKDIR todo
 ---> Running in 2e59f5df7152
 ---> 8686b344b124
Removing intermediate container 2e59f5df7152
Step 4 : RUN npm install
 ---> Running in bdf07a308fca
npm info it worked if it ends with ok
[...]
npm info ok
 ---> 6cf8f3633306
Removing intermediate container bdf07a308fca
Step 5 : RUN chmod -R 777 /todo
 ---> Running in c03f27789768
 ---> 2c0ededd3a5e
Removing intermediate container c03f27789768
Step 6 : EXPOSE 8000
 ---> Running in 46685ea97b8f
 ---> f1c29feca036
Removing intermediate container 46685ea97b8f
Step 7 : CMD npm start
 ---> Running in 7b4c1a9ed6af
 ---> 439b172f994e
Removing intermediate container 7b4c1a9ed6af
Successfully built 439b172f994e
npm install
npm info it worked if it ends with ok
npm info using [email protected]
npm info using [email protected]
npm WARN package.json [email protected] No repository field.
npm WARN package.json [email protected] license should be a
➥ valid SPDX license expression
npm info preinstall [email protected]
npm info package.json [email protected] license should be a valid
➥ SPDX license expression
npm info package.json [email protected] No license field.
npm info package.json [email protected] No license field.
npm info package.json [email protected] license should be a valid
➥ SPDX license expression
npm info package.json [email protected] No license field.
npm info build /todo
npm info linkStuff [email protected]
npm info install [email protected]
npm info postinstall [email protected]
npm info prepublish [email protected]
npm info ok
if [ ! -e dist/ ]; then mkdir dist; fi
cp node_modules/react/dist/react.min.js dist/react.min.js
The docker run subcommand starts up the container B. The -p flag maps the con-
tainer’s port 8000 to the port 8000 on the host machine, so you should now be able to
navigate with your browser to http://localhost:8000 to view the application. The --name
flag gives the container a unique name you can refer to later for convenience. The last
argument is the image name.
Once the container was started, we hit CTRL-C to terminate the process and the
container C. You can run the ps command to see the containers that have been
started but not removed D. Note that each container has its own container ID and sta-
tus, analogous to a process. Its status is Exited, but you can restart it E. After you do,
notice how the status has changed to Up and the port mapping from container to host
machine is now displayed F.
The docker diff subcommand shows you which files have been affected since the
image was instantiated as a container G. In this case, the todo directory has been
changed H and the other listed files have been added I. No files have been deleted,
which is the other possibility.
As you can see, the fact that Docker “contains” your environment means that you
can treat it as an entity on which actions can be predictably performed. This gives
Docker its breadth of power—you can affect the software lifecycle from development
to production and maintenance. These changes are what this book will cover, showing
you in practical terms what can be done with Docker.
Next you’re going to learn about layering, another key concept in Docker.
(see figure 1.9). Whenever a running container needs to write to a file, it records the
change by copying the item to a new area of disk. When a Docker commit is performed,
this new area of disk is frozen and recorded as a layer with its own identifier.
This partly explains how Docker containers can start up so quickly—they have
nothing to copy because all the data has already been stored as the image.
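The same mechanics can be driven from code as well as from the command line. The following sketch uses the Docker SDK for Python (docker-py), which this chapter doesn't use; the image and repository names are illustrative.

# Illustrative sketch (not from the chapter): watching copy-on-write changes and
# committing them as a new layer, using the Docker SDK for Python (docker-py).
import docker

client = docker.from_env()

# Run a container that writes a file; the change lands in the container's own
# copy-on-write area, not in the underlying image.
container = client.containers.run("node", "touch /marker", detach=True)
container.wait()

print(container.diff())    # reports /marker as an added file

# Freeze the container's changes as a new image layer.
image = container.commit(repository="todoapp-experiment", tag="v1")
print(image.id)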
Figure 1.10 illustrates that the to-do app you’ve built has three layers you’re interested in.
(Figure 1.10: three layers together make the to-do image. On top sits the todoapp layer, image ID bd0921d1, size 600k, holding your to-do app's files; beneath it is the node layer, image ID efc12dea, size 1.5M, where the Node binaries and files are added.)
Because the layers are static, you only need to build on top of the image you wish to take
as a reference, should anything need to change in a higher layer. In the to-do app,
you built from the publicly available node image and layered changes on top.
All three layers can be shared across multiple running containers, much as a
shared library can be shared in memory across multiple running processes. This is a
vital feature for operations, allowing the running of numerous containers based on
different images on host machines without running out of disk space.
Imagine that you’re running the to-do app as a live service for paying customers.
You can scale up your offering to a large number of users. If you’re developing, you
can spin up many different environments on your local machine at once. If you’re
moving through tests, you can run many more tests simultaneously, and far more
quickly than before. All these things are made possible by layering.
By building and running an application with Docker, you’ve begun to see the
power that Docker can bring to your workflow. Reproducing and sharing specific envi-
ronments and being able to land these in various places gives you both flexibility and
control over development.
1.3 Summary
Depending on your previous experience with Docker, this chapter might have been a
steep learning curve. We’ve covered a lot of ground in a short time.
You should now
Understand what a Docker image is
Know what Docker layering is, and why it’s useful
Be able to commit a new Docker image from a base image
Know what a Dockerfile is
We’ve used this knowledge to
Create a useful application
Reproduce state in an application with minimal effort
Next we’re going to introduce techniques that will help you understand how Docker
works and, from there, discuss some of the broader technical debate around Docker’s
usage. These first two introductory chapters form the basis for the remainder of the
book, which will take you from development to production, showing you how Docker
can be used to improve your workflow.
What's inside
Speed up your DevOps pipeline with the use of containers
Reduce the effort of maintaining and configuring software
Using Docker to cheaply replace VMs
Streamlining your cloud workflow
Using the Docker Hub and its workflow
Navigating the Docker ecosystem
Written for developers and devops engineers who have already started their Docker
journey and want to use it effectively in a production setting.
Introducing Mesos
…applications with the resources they need, without the overhead of virtual machines and
operating systems. You can see a simplified example of this in figure 1.1.
(Figure 1.1: frameworks such as Spark, Jenkins CI, Marathon, and Chronos run on a Mesos cluster. Mesos offers available cluster resources directly to frameworks, while the OS kernel on each server provides access to the underlying physical or virtual resources: CPU, memory, and disk.)
(CPU, memory, disk)
This book introduces Apache Mesos, an open source cluster manager that allows sys-
tems administrators and developers to focus less on individual servers and more on
the applications that run on them. You’ll see how to get up and running with Mesos in
your environment, how it shares resources and handles failure, and—perhaps most
important—how to use it as a platform to deploy applications.
In the next few sections, you’re going to look at how Mesos works to provide all of
these features and how it compares to a traditional datacenter.
Figure 1.2 Mesos advertises the available CPU, memory, and disk as resource offers to frameworks. The Spark scheduler checks whether it has work to do and accepts or rejects the offer; accepted work runs in a container on a Mesos slave.
1 The Mesos slave offers its available CPU, memory, and disk to the Mesos master
in the form of a resource offer.
2 The Mesos master's allocation module—or scheduling algorithm—decides which
frameworks—or applications—to offer the resources to.
3 In this particular case, the Spark scheduler doesn't have any jobs to run on the
cluster. It rejects the resource offer, allowing the master to offer the resources to
another framework that might have some work to do.
4 Now consider a user submitting a Spark job to be run on the cluster. The sched-
uler accepts the job and waits for a resource offer that satisfies the workload.
5 The Spark scheduler accepts a resource offer from the Mesos master, and
launches one or more tasks on an available Mesos slave. These tasks are launched
within a container, providing isolation between the various tasks that might be
running on a given Mesos slave.
Seems simple, right? Now that you’ve learned how Mesos uses resource offers to adver-
tise resources to frameworks, and how two-tier scheduling allows frameworks to accept
and reject resource offers as needed, let’s take a closer look at some of these funda-
mental concepts.
NOTE An effort is underway to rename the Mesos slave role to agent for future
versions of Mesos. Because this book covers Mesos 0.22.2, it uses the terminol-
ogy of that specific release, so as to not create any unnecessary confusion. For
more information, see https://issues.apache.org/jira/browse/MESOS-1478.
RESOURCE OFFERS
Like many other cluster managers, Mesos clusters are made up of groups of machines
called masters and slaves. Each Mesos slave in a cluster advertises its available CPU,
memory, and storage in the form of resource offers. As you saw in figure 1.2, these
resource offers are periodically sent from the slaves to the Mesos masters, processed
by a scheduling algorithm, and then offered to a framework’s scheduler running on
the Mesos cluster.
TWO-TIER SCHEDULING
In a Mesos cluster, resource scheduling is the responsibility of the Mesos master’s allo-
cation module and the framework’s scheduler, a concept known as two-tier scheduling.
As previously demonstrated, resource offers from Mesos slaves are sent to the master’s
allocation module, which is then responsible for offering resources to various frame-
work schedulers. The framework schedulers can accept or reject the resources based
on their workload.
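The division of labor can be sketched in a few lines of Python; this is purely illustrative, not Mesos code and not code from the book.

# Purely illustrative sketch of two-tier scheduling (not Mesos code).
# Tier 1: the master's allocation module decides which framework gets the offer.
# Tier 2: the framework's scheduler accepts or rejects it based on its workload.

class FrameworkScheduler:
    def __init__(self):
        self.pending_jobs = []                 # jobs submitted by users

    def resource_offer(self, offer):
        if not self.pending_jobs:
            return None                        # reject: nothing to run right now
        job = self.pending_jobs.pop(0)
        return {"job": job, "uses": offer}     # accept: launch tasks with the offer

def allocation_module(offer, frameworks):
    for framework in frameworks:               # decide who is offered the resources
        launched = framework.resource_offer(offer)
        if launched is not None:
            return launched
    return None                                # nobody had work to do; the offer expires

spark_like = FrameworkScheduler()
offer = {"cpus": 4, "mem_mb": 8192, "disk_mb": 40960}   # advertised by a slave
print(allocation_module(offer, [spark_like]))            # None: offer rejected
spark_like.pending_jobs.append("word-count")
print(allocation_module(offer, [spark_like]))            # job launched with the offer

In a real cluster this decision lives in the framework's scheduler component, which registers with the leading master (see section 1.3.3).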
The allocation module is a pluggable component of the Mesos master that
implements an algorithm to determine which offers are sent to which frameworks
(and when). The modular nature of this component allows systems engineers to
implement their own resource-sharing policies for their organization. By default,
Mesos uses an algorithm developed at UC Berkeley known as Dominant Resource
Fairness (DRF):
In a nutshell, DRF seeks to maximize the minimum dominant share across all users. For
example, if user A runs CPU-heavy tasks and user B runs memory-heavy tasks, DRF
attempts to equalize user A’s share of CPUs with user B’s share of memory. In the single-
resource case, DRF reduces to max-min fairness for that resource.1
1. A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica, "Dominant Resource Fairness: Fair Allocation of Multiple Resource Types," NSDI, vol. 11, 2011.
Mesos’s use of the DRF algorithm by default is fine for most deployments. Chances are
you won’t need to write your own allocation algorithm, so this book doesn’t go into
much detail about DRF. If you’re interested in learning more about this research, you
can find the paper online at www.usenix.org/legacy/events/nsdi11/tech/full_papers/Ghodsi.pdf.
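As a toy illustration of the idea (not from the book or the paper), dominant shares can be computed and compared like this:

# Toy illustration of Dominant Resource Fairness (not from the book or the paper).
# A user's dominant share is their largest fractional share of any single resource;
# DRF directs the next offer to the user with the smallest dominant share.
cluster = {"cpus": 100, "mem_gb": 400}

allocations = {
    "user_a": {"cpus": 30, "mem_gb": 30},    # CPU-heavy tasks
    "user_b": {"cpus": 10, "mem_gb": 100},   # memory-heavy tasks
}

def dominant_share(allocation):
    return max(allocation[resource] / cluster[resource] for resource in cluster)

for user, allocation in allocations.items():
    print(user, dominant_share(allocation))   # user_a: 0.30 (CPU), user_b: 0.25 (memory)

next_user = min(allocations, key=lambda user: dominant_share(allocations[user]))
print("next offer goes to", next_user)        # user_b, to equalize dominant shares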
RESOURCE ISOLATION
Using Linux cgroups or Docker containers to isolate processes, Mesos allows for multitenancy,
or for multiple processes to be executed on a single Mesos slave. A framework
then executes its tasks within the container, using a Mesos containerizer. If you’re not
familiar with containers, think of them as a lightweight approach to how a hypervisor
runs multiple virtual machines on a single physical host, but without the overhead or
need to run an entire operating system.
Now that you have a clearer understanding of how Mesos works, you can move on to
understanding how this technology compares to the traditional datacenter. More spe-
cifically, the next section introduces the concept of an application-centric datacenter,
where the focus is more on applications than on the servers and operating systems
that run them.
(Figure: virtual machine–based application deployment, with apps running in VMs on hypervisors, compared with container-based application deployment, with apps running on Mesos directly on the host OS.)
overhead of the operating system, which consumes added CPU, memory, and disk. At
a large-enough scale, this becomes wasteful. With an application-centric approach to
managing datacenters, Mesos allows you to simplify your stack—and your application
deployments—using lightweight containers.
CONTAINERS
As you learned previously, Mesos uses containers for resource isolation between
processes. In the context of Mesos, the two most important resource-isolation meth-
ods to know about are the control groups (cgroups) built into the Linux kernel,
and Docker.
Around 2007, support for control groups (referred to as cgroups throughout this
text) was made available in the Linux kernel, beginning with version 2.6.24. This
allows the execution of processes in a way that’s sandboxed from other processes. In the
context of Mesos, cgroups provide resource constraints for running processes, ensur-
ing that they don’t interfere with other processes running on the system. When using
cgroups, any packages or libraries that the tasks might depend on (a specific version
of Python, a working C++ compiler, and so on) must be already present on the host
operating system. If your workloads, packages, and required tools and libraries are
fairly standardized or don’t readily conflict with each other, this might not be a prob-
lem. But consider figure 1.4, which demonstrates how using Docker can overcome
these sorts of problems and allow you to run applications and workloads in a more iso-
lated manner.
(Figure 1.4: an application deployed in a VM with its packages, libraries, and a guest OS; in a Linux cgroup sharing the host's packages and libraries; and in a Docker container that bundles the application with its packages and libraries on the Docker engine.)
Using low-level primitives in the Linux kernel, including cgroups and namespaces,
Docker provides a means to build and deploy containers almost as if they were virtual
machines. The application and all of its dependencies are packaged within the con-
tainer and deployed atop a host operating system. They take a concept from the
freight industry—the standardized industrial shipping container—and apply this to
application deployment. In recent years, this new unit of software delivery has grown
in popularity as it’s generally considered to be more lightweight than deploying an
entire virtual machine.
You don’t need to understand all the implementation details and intricacies of
building and deploying containers to use Mesos, though. If you’d like more informa-
tion, please consult the following online resources:
Linux control groups: www.kernel.org/doc/documentation/cgroup-v1/cgroups.txt
Docker: https://docs.docker.com
NOTE This book covers Mesos version 0.22.2, which provides an environment
for running stateless and distributed applications. Beginning in version 0.23,
Mesos will begin work to support persistent resources, thus enabling support
for stateful frameworks. For more information on this effort, see
https://issues.apache.org/jira/browse/MESOS-1554.
For example, consider the stateless, distributed, and stateful technologies in table 1.1.
Table 1.1 Technologies that are—and aren’t—good candidates to run on Mesos
Distributed out of the box (Cassandra, Elasticsearch, Hadoop Distributed File System (HDFS)): yes, provided the correct level of redundancy is in place.
The real value of Mesos is realized when running stateless services and applications—
applications that will handle incoming loads but that could go offline at any time with-
out negatively impacting the service as a whole, or services that run a job and report
the result to another system. As noted previously, examples of some of these applica-
tions include Ruby on Rails and Jenkins CI build slaves.
Progress has been made running distributed databases (such as Cassandra and
Elasticsearch) and distributed filesystems (such as Hadoop Distributed File System, or
HDFS) as Mesos frameworks. But this is feasible only if the correct level of redundancy
is in place. Although certain distributed databases and filesystems have data replica-
tion and fault tolerance built in, your data might not survive if the entire Mesos cluster
fails (because of natural disasters, redundant power/cooling systems failures, or human
error). In the real world, you should weigh the risks and benefits of deploying services
that persist data on a Mesos cluster.
As I mentioned earlier, Mesos excels at running stateless, distributed services.
Stateful applications that need to persist data to disk aren’t good candidates for run-
ning on Mesos as of this writing. Although possible, it’s not yet advisable to run certain
databases such as MySQL and PostgreSQL atop a Mesos cluster. When you do need to
persist data, it’s preferable to do so by deploying a traditional database cluster outside
the Mesos cluster.
This section presents two primary reasons that you should rethink how datacenters
are managed: the administrative overhead of statically partitioning resources, and the
need to focus more on applications instead of infrastructure.
Now consider solving the aforementioned scaling scenario by using Mesos, as shown
in figure 1.6. You can see that you’d use these same machines in the datacenter to
focus on running applications instead of virtual machines. The applications could run
on any machine with available resources. If you need to scale, you add servers to the
Mesos cluster, instead of adding machines to multiple clusters. If a single Mesos node
goes offline, no particular impact occurs to any one service.
Docker containers. Instead of having multiple environments (one each for develop-
ment, staging, and production), the entire datacenter becomes a platform on which
to deploy applications.
Where Mesos is commonly referred to—and acts as—a distributed kernel, other
Mesos frameworks help users run long-running and scheduled tasks, similar to the
init and Cron systems, respectively. You’ll learn more about these frameworks (Mar-
athon, Chronos, and Aurora) and how to deploy applications on them later in
this book.
Consider the power of what I’ve described so far: Mesos provides fault tolerance
out of the box. Instead of a systems administrator getting paged when a single server
goes offline, the cluster will automatically start the failed job elsewhere. The sysadmin
needs to be concerned only if a certain percentage of machines goes offline in the
datacenter, as that might signal a larger problem. As such, with the correct placement
and redundancy in place, scheduled maintenance can occur at any time.
1.3.1 Masters
One or more Mesos masters are responsible for managing the Mesos slave daemons
running on each machine in the cluster. Using ZooKeeper, they coordinate which
node will be the leading master, and which masters will be on standby, ready to take
over if the leading master goes offline.
The leading master is responsible for deciding which resources to offer to a partic-
ular framework using a pluggable allocation module, or scheduling algorithm, to distrib-
ute resource offers to the various schedulers. The scheduler can then either accept or
reject the offer based on whether it has any work to be performed at that time.
A Mesos cluster requires a minimum of one master, and three or more are recom-
mended for production deployments to ensure that the services are highly available.
You can run ZooKeeper on the same machines as the Mesos masters themselves, or
use a standalone ZooKeeper cluster. Chapter 3 goes into more detail about the sizing
and deploying of Mesos masters.
(Figure: framework A and framework B schedulers communicate with the leading Mesos master; a ZooKeeper ensemble coordinates leader election among the masters, and slaves 1 through n make up the cluster.)
1.3.2 Slaves
The machines in a cluster responsible for executing a framework’s tasks are referred
to as Mesos slaves. They query ZooKeeper to determine the leading Mesos master and
advertise their available CPU, memory, and storage resources to the leading master in
the form of a resource offer. When a scheduler accepts a resource offer from the
Mesos master, it then launches one or more executors on the slave, which are responsi-
ble for running the framework’s tasks.
Mesos slaves can also be configured with certain attributes and resources, which
allow them to be customized for a given environment. Attributes refer to key/value
pairs that might contain information about the node’s location in a datacenter, and
resources allow a particular slave’s advertised CPU, memory, and disk to be overridden
with user-provided values, instead of Mesos automatically detecting the available
resources on the slave. Consider the following example attributes and resources:
--attributes='datacenter:pdx1;rack:1-1;os:rhel7'
--resources='cpu:24;mem:24576;disk:409600'
I’ve configured this particular Mesos slave to advertise its datacenter; location within
the datacenter; operating system; and user-provided CPU, memory, and disk resources.
This information is especially useful when trying to ensure that applications stay
online during scheduled maintenance. Using this information, a datacenter operator
could take an entire rack (or an entire row!) of machines offline for scheduled main-
tenance without impacting users. Chapter 4 covers this (and more) in the Mesos slave
configuration section.
1.3.3 Frameworks
As you learned earlier, a framework is the term given to any Mesos application that’s
responsible for scheduling and executing tasks on a cluster. A framework is made up
of two components: a scheduler and an executor.
SCHEDULER
A scheduler is typically a long-running service responsible for connecting to a Mesos
master and accepting or rejecting resource offers. Mesos delegates the responsibility
of scheduling to the framework, instead of attempting to schedule all the work for a
cluster itself. The scheduler can then accept or reject a resource offer based on
whether it has any tasks to run at the time of the offer. The scheduler detects the lead-
ing master by communicating with the ZooKeeper cluster, and then registers itself to
that master accordingly.
EXECUTOR
An executor is a process launched on a Mesos slave that runs a framework’s tasks on a
slave. As of this writing, the built-in Mesos executors allow frameworks to execute shell
scripts or run Docker containers. New executors can be written using Mesos’s various
language bindings and bundled with the framework, to be fetched by the Mesos slave
when a task requires it.
As you’ve learned, Mesos provides a distributed, highly available architecture. Mas-
ters schedule work to be performed on the cluster, and slaves advertise available
resources to the schedulers, which in turn execute tasks on the cluster.
1.4 Summary
In this chapter, you’ve been introduced to the Apache Mesos project, its architecture,
and how it attempts to solve scaling problems and make clustering simple. You’ve also
learned how Mesos deployments compare and contrast with the traditional datacen-
ter, and how an application-centric approach can lead to using resources more effi-
ciently. We’ve discussed when (and when not) to use Mesos for a given workload, and
where you can get help and find more information, should you need it. Here are a few
things to remember:
Mesos abstracts CPU, memory, and disk resources away from underlying systems
and presents multiple machines as a single entity.
Mesos slaves advertise their available CPUs, memory, and disk in the form of
resource offers.
A Mesos framework comprises two primary components: a scheduler and an
executor.
Containers are a lightweight method to provide resource isolation to individual
processes.
In the next chapter, I’ll walk you through a real-world example of how Mesos allows
for more efficient resource use, and how you might run applications in your own data-
center by building on projects in the Mesos ecosystem.
What's inside
Spinning up your first Mesos cluster
Deploying containerized applications on Mesos
Scheduling, resource administration, and logging
Deploying applications using the popular Marathon, Chronos, and Aurora frameworks
Writing custom Mesos frameworks using Python
Readers need to be familiar with the core ideas of data center administration, includ-
ing networking, virtualization, and application deployment on Linux systems. The
Python-based code examples should be clear to readers using Mesos bindings for
other popular languages, including C++, Go, and Scala.
Foundational RabbitMQ
Whether your application is in the cloud or in your own data center, RabbitMQ is a
lightweight and extremely powerful tool for creating distributed software architectures
that range from the very simple to the incredibly complex. In this chapter you will
learn how RabbitMQ, as messaging-oriented middleware, allows tremendous flexibility
in how you approach and solve problems. You will learn how some companies are
using it, and about the key features that make RabbitMQ one of the most popular
message brokers today.
While we will explore the features on this list in later chapters, I would like to focus on
the two most foundational features of RabbitMQ: the language it is programmed in
(Erlang), and the model it is based on (the Advanced Message Queuing Model), a
specification that defines much of the RabbitMQ lexicon and behavior.
Erlang’s design around concurrent processing and message passing makes it a natural
choice for a message broker like RabbitMQ: As an application, a message broker
maintains concurrent connections, routes messages, and manages their state. In addi-
tion, Erlang’s distributed communication architecture makes it a natural fit for Rab-
bitMQ’s clustering mechanism. Servers in a RabbitMQ cluster make use of Erlang’s
inter-process communication (IPC) system, offloading the functionality that many com-
peting message brokers have to implement to add clustering capabilities (figure 1.1).
Despite the advantages RabbitMQ has using Erlang, the Erlang environment can
be a stumbling block. If this is your first foray into Erlang, check out Appendix B: Just
Enough Erlang. In the appendix, you will learn enough to be confident in managing
RabbitMQ’s configuration files and you will learn how to use Erlang to gather infor-
mation about RabbitMQ’s current runtime state.
Figure 1.1 RabbitMQ clusters use the native Erlang inter-process communication mechanism in
the Erlang virtual machine (VM) for cross-node communication, sharing state information and allowing
messages to be published and consumed across the entire cluster.
NOTE There are multiple versions of the AMQP specification. For the purposes
of this book, we will focus only on AMQP 0-9-1. While newer versions of
RabbitMQ support AMQP 1.0 as a plugin extension, the core RabbitMQ
architecture is more closely related to AMQP 0-8 and 0-9-1. The AMQP
specification is primarily made up of two documents: a top-level document that
describes both the AMQ model and the AMQ protocol, and a more detailed
document that provides varying levels of information about every class,
method, property, and field. More information about AMQP, including the
specification documents, may be found at http://www.amqp.org.
There are multiple popular message brokers and messaging protocols, and it is impor-
tant that you consider the impact the protocol and broker will have on your applica-
tion. While RabbitMQ supports AMQP, it also supports other protocols, such as
MQTT, Stomp, and XMPP. RabbitMQ’s protocol neutrality and plugin extensibility
make it a good choice for multi-protocol application architectures when compared to
other popular message brokers.
It is RabbitMQ’s roots in the AMQP specification that outline its primary architec-
ture and communication methodologies. This is an important distinction when evalu-
ating RabbitMQ against other message brokers. As with AMQP, RabbitMQ set out to
be a vendor-neutral, platform-independent solution for the complex needs that mes-
saging-oriented architectures demand, such as flexible message routing, configurable
message durability, and inter-datacenter communication, to name a few.
As you can see, RabbitMQ is not only used by some of the largest sites on the Internet;
it has also found its way into academia for large-scale scientific research, and NASA
found it fitting to use RabbitMQ at the core of its network infrastructure management
stack. As these examples show, RabbitMQ has been used in mission-critical applica-
tions in many different environments and industries with tremendous success.
Figure 1.2 Before: Once a user has logged in, each database is updated with a timestamp
sequentially and dependently. The more tables you add, the longer this takes.
As the site continued to grow, the amount of time it took for a member to log in
also grew. The reason was fairly straightforward: whenever a new application needed
the member’s last-login timestamp, its own database tables carried a copy of the value
to keep lookups fast and avoid cross-database joins. To keep the data up to date and
accurate, those new tables also had to be updated when the member logged in. It was
not long before there were many tables being maintained this way.
The performance degradation began to creep up as the database updates were per-
formed serially. Each query updating the member’s last login timestamp needed to
finish before the next began. Ten queries that were considered performant, each fin-
ishing within 50ms, would add up to half a second in database updates alone. All of
these queries would have to finish prior to sending the authorization response and
redirect back to the user. In addition, any operational issues on a database server com-
pounded the problem. If one database server in the chain started responding slowly
or became unresponsive, members could no longer log in to the site.
To decouple the user-facing login application from directly writing to the database,
I published messages to message-oriented middleware, a centralized message broker
that would distribute the messages to consumer applications handling the required
database writes. While I first experimented with several different message brokers, I
ultimately landed on RabbitMQ as my broker of choice.
After decoupling the login process from the required database updates, we discovered
a new level of freedom. Members could log in quickly because we were no lon-
ger updating the database as part of the authentication process. Instead, a member
login message was published with all of the information needed to update any data-
base, and consumer applications were written that updated each database table inde-
pendently (figure 1.3). This login message would not contain authentication for the
member, but instead, only the information needed to maintain the member’s last-
login status in our databases and applications. This allowed us to horizontally scale
database writes with more control. By controlling the number of consumer applica-
tions writing to a specific database server, we were able to throttle database writes for
servers that had started to strain under the load created by new site growth, while we
worked through their own unique scaling issues.
As I detail the advantages of a messaging-based architecture, it is important to note
that the broker and the publishing path could also impact the performance of systems
like the login architecture described. Any number of problems may impact publisher
performance, from networking issues to RabbitMQ throttling message publishers.
When such events happen, your application will see degraded performance. In addition to the horizon-
Figure 1.3 After: Using RabbitMQ, loosely coupled data is published to each database asynchronously
and independently, allowing the login application to proceed without waiting on any database writes.
tal scaling of consumers, it is wise to plan for horizontal scaling of message brokers to
allow for better message throughput and publisher performance.
Figure 1.4 When communicating with a database, a tightly coupled application must wait for the
database server to respond to continue processing.
Figure 1.5 A loosely coupled application allows the application that would have saved the data directly
in the database to publish the data to RabbitMQ, allowing for the asynchronous processing of data.
In this model, should a database need to be taken offline for maintenance, or should
the write workload become too heavy, you can throttle the consumer application or
stop it. Until the consumer is able to receive the message, the data will persist in the
queue. The ability to pause or throttle consumer application behavior is just one
advantage of using this type of architecture.
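To illustrate that consumer side, here is a minimal sketch using RabbitMQ’s official Java client; the book’s own examples use the Python rabbitpy library, and the queue name shown is made up. Stopping or pausing a process like this one simply leaves undelivered messages waiting in the queue.

import com.rabbitmq.client.*;

public class LastLoginConsumer {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        Connection conn = factory.newConnection();
        Channel channel = conn.createChannel();
        channel.basicQos(10); // throttle: at most 10 unacknowledged messages at a time
        channel.basicConsume("last-login-updates", false, new DefaultConsumer(channel) {
            @Override
            public void handleDelivery(String consumerTag, Envelope envelope,
                                       AMQP.BasicProperties properties, byte[] body)
                    throws java.io.IOException {
                // Write the last-login timestamp to this consumer's database here.
                getChannel().basicAck(envelope.getDeliveryTag(), false);
            }
        });
    }
}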
Figure 1.6 By using RabbitMQ, the publishing application does not need to be changed in order to
deliver the same data to both a new cloud-based service and the original database.
Figure 1.7 By leveraging RabbitMQ’s federation plugin, messages can be duplicated to perform
the same work in multiple data centers.
Figure 1.8 Bi-directional federation of data allows for the same data events to be received and
processed in both data centers.
Figure 1.9 When a publisher sends a message into RabbitMQ, it first goes to an exchange.
QUEUES
A queue is responsible for storing received messages and may contain configuration
information that defines what it is able to do with a message. Queues may hold mes-
sages in RAM only or may persist them to disk, and they deliver messages in first-in,
first-out (FIFO) order.
BINDINGS
To define a relationship between queues and exchanges, the AMQ model defines a
binding. In RabbitMQ, bindings or binding-keys tell an exchange which queues to
deliver messages to. For some exchange types, the binding-key also instructs the exchange
to filter which messages it can deliver to a queue. When publishing a message to an exchange,
applications use a routing-key attribute. Sometimes this may be a queue name, at other
times it may be a string that semantically describes the message. When a message is
evaluated by an exchange to be routed to the appropriate queues, the message’s rout-
ing-key is evaluated against the binding-key (figure 1.10). In other words, the binding-
key is the glue that binds a queue to an exchange and the routing-key is the criteria
that is evaluated against it.
Figure 1.10 A queue is bound to an exchange, providing the information the exchange needs to route a
message to it.
In the simplest of scenarios, the routing key may be the queue name, though this
varies with each exchange type. In RabbitMQ, each exchange type is likely to treat
routing-keys in a different way; some exchanges invoke simple equality checks and
others use more complex pattern extractions from the routing-key. There is even an
exchange type that ignores the routing-key outright in favor of other information in
the message properties.
In addition to binding queues to exchanges, as defined in the AMQ model, Rab-
bitMQ extends the AMQP specification to allow exchanges to bind to other
exchanges. This feature creates a great deal of flexibility in creating different routing
patterns for messages. You will learn more about routing patterns available when you
use exchanges, and about exchange-to-exchange bindings in chapter 6, Common
Messaging Patterns.
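The relationship is easier to see in code. The following minimal sketch uses RabbitMQ’s official Java client rather than the rabbitpy library the book’s examples use, and the exchange, queue, and key names are purely illustrative: the binding-key is supplied when the queue is bound, and the routing-key is supplied with each published message.

import com.rabbitmq.client.Channel;
import com.rabbitmq.client.Connection;
import com.rabbitmq.client.ConnectionFactory;
import java.nio.charset.StandardCharsets;

public class BindingExample {
    public static void main(String[] args) throws Exception {
        ConnectionFactory factory = new ConnectionFactory();
        factory.setHost("localhost");
        try (Connection conn = factory.newConnection();
             Channel channel = conn.createChannel()) {
            channel.exchangeDeclare("member-events", "direct");
            channel.queueDeclare("last-login-updates", true, false, false, null);
            // The binding-key glues the queue to the exchange...
            channel.queueBind("last-login-updates", "member-events", "member.login");
            // ...and each message's routing-key is evaluated against it.
            channel.basicPublish("member-events", "member.login", null,
                    "{\"member_id\": 42}".getBytes(StandardCharsets.UTF_8));
        }
    }
}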
1.5 Summary
RabbitMQ, as messaging-oriented middleware, is an exciting technology that enables
a degree of operational flexibility that is difficult to achieve with tightly coupled
application architectures. By diving deep into RabbitMQ’s AMQP foundation and
behaviors, this book is a valuable reference, providing insight into how your
applications can leverage its robust and powerful features. In particular, you will soon
learn how to publish messages and use the dynamic routing features in RabbitMQ to
selectively sip from the fire hose of data your application can send; data that once may
have been deeply buried in tightly coupled code and processes in your environment.
Whether you are an application developer or a high-level application architect, it is
advantageous to have deep knowledge of how your applications can benefit
from RabbitMQ’s diverse functionality. Thus far, you have already learned the most
foundational concepts that comprise the Advanced Message Queuing Model. Expand-
ing on these concepts in Part 1 of this book, you will learn about the Advanced Mes-
sage Queuing Protocol and how it defines the core of RabbitMQ’s behavior.
Because this book will be hands-on with the goal of imparting the knowledge
required to use RabbitMQ in the most demanding of environments, you will start work-
ing with code in the next chapter. By learning “how to speak Rabbit” you will leverage
the fundamentals of the Advanced Message Queuing Protocol, writing code to send
and receive messages with RabbitMQ. To speak Rabbit, you will be using a Python-
based library called rabbitpy, a library that was written specifically for the code exam-
ples in this book; I’ll introduce it to you in the next chapter. Even if you are an experi-
enced developer who has written applications that communicate with RabbitMQ, you
should at least browse through the next chapter to understand what is happening at
the protocol level when you are using RabbitMQ via the AMQP protocol.
What's inside
Understanding the AMQP model
Communicating via MQTT, Stomp, and HTTP
Valuable troubleshooting techniques
Integrating with Java technologies like Hadoop and Esper
Database integrations with PostgreSQL and Riak
Written for programmers with a basic understanding of messaging-oriented systems
and RabbitMQ.
In this chapter we’ll present the first of two sets of case studies contributed by com-
panies that have used Netty extensively in their internal infrastructure. We hope
that these examples of how others have utilized the framework to solve real-world
problems will broaden your understanding of what you can accomplish with Netty.
NOTE The author or authors of each study were directly involved in the
project they discuss.
This is a case study on how we moved from a monolithic and sluggish LAMP1 appli-
cation to a modern, high-performance and horizontally distributed infrastructure,
implemented atop Netty.
1
An acronym for a typical application technology stack; originally Linux, Apache Web Server, MySQL, and
PHP.
response was received—because the file would still need to be uploaded to S3 and
have its thumbnails generated.
The larger the file, the longer the hiatus. For very large files the connection would
eventually time out waiting for the okay from the server. Back then Droplr could offer
uploads of only up to 32 MB per file because of this very problem.
There were two distinct approaches to cut down upload times:
Approach A, optimistic and apparently simpler (figure 14.1):
– Fully receive the file
– Save to the local filesystem and immediately return success to client
– Schedule an upload to S3 some time in the future
the file to S3) will ultimately succeed, and the user could end up with a broken link
that might get posted on Twitter or sent to an important client. This is unacceptable,
even if it happens on one in every hundred thousand uploads.
Our current numbers show that we have an upload failure rate slightly below
0.01% (1 in every 10,000), the vast majority being connection timeouts between client
and server before the upload actually completes.
We could try to work around it by serving the file from the machine that received it
until it is finally pushed to S3, but this approach is in itself a can of worms:
If the machine fails before a batch of files is completely uploaded to S3, the files
would be forever lost.
There would be synchronization issues across the cluster (“Where is the file for
this drop?”).
Extra, complex logic would be required to deal with edge cases, and this keeps
creating more edge cases.
Thinking through all the pitfalls with every workaround, I quickly realized that it’s a
classic hydra problem—for each head you chop off, two more appear in its place.
THE SAFE BUT COMPLEX APPROACH
The other option required low-level control over the whole process. In essence, we
had to be able to
Open a connection to S3 while receiving the upload from the client.
Pipe data from the client connection to the S3 connection.
Buffer and throttle both connections:
– Buffering is required to keep a steady flow between both client-to-server and
server-to-S3 legs of the upload.
– Throttling is required to prevent explosive memory consumption in case the
server-to-S3 leg of the upload becomes slower than the client-to-server leg.
Cleanly roll everything back on both ends if things went wrong.
It seems conceptually simple, but it’s hardly something your average webserver can
offer. Especially when you consider that in order to throttle a TCP connection, you
need low-level access to its socket.
It also introduced a new challenge that would ultimately end up shaping our final
architecture: deferred thumbnail creation.
This meant that whichever technology stack the platform ended up being built
upon, it had to offer not only a few basic things like incredible performance and sta-
bility but also the flexibility to go bare metal (read: down to the bytes) if required.
@Override
public void channelIdle(ChannelHandlerContext ctx,
IdleStateEvent e) throws Exception {
// Shut down connection to client and roll everything back.
}
As explained previously in this book, you should never execute non-CPU-bound code
on Netty’s I/O threads—you’ll be stealing away precious resources from Netty and
thus affecting the server’s throughput.
For this reason, both the HttpRequest and HttpChunk may hand off the execution
to the request handler by switching over to a different thread. This happens when the
request handlers aren’t CPU-bound, whether because they access the database or per-
form logic that’s not confined to local memory or CPU.
When thread-switching occurs, it’s imperative that all the blocks of code execute in
serial fashion; otherwise we’d risk, for an upload, having HttpChunk n-1 being pro-
cessed after HttpChunk n and thus corrupting the body of the file. (We’d be swapping
how bytes were laid out in the uploaded file.) To cope with this, I created a custom
thread-pool executor that ensures all tasks sharing a common identifier will be exe-
cuted serially.
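The executor Droplr used isn’t reproduced in this excerpt, but the idea can be sketched in a few lines. The class below is a simplified, hypothetical stand-in (no error handling, and idle keys are never cleaned up) that chains each task for a given identifier behind the previous one, so tasks sharing a key run serially while different keys run concurrently.

import java.util.Map;
import java.util.concurrent.*;

public class KeySerializedExecutor {
    private final ExecutorService pool = Executors.newFixedThreadPool(8);
    // Tail of the task chain for each key; new tasks are appended to it.
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    public void submit(String key, Runnable task) {
        tails.compute(key, (k, tail) ->
            (tail == null ? CompletableFuture.completedFuture((Void) null) : tail)
                .thenRunAsync(task, pool));
    }
}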
From here on, the data (requests and chunks) ventures out of the realms of Netty
and Droplr.
I’ll explain briefly how the request handlers are built for the sake of shedding
some light on the bridge between the RequestController—which lives in Netty-
land—and the handlers—Droplr-land. Who knows, maybe this will help you architect
your own server!
THE REQUEST HANDLERS
Request handlers provide Droplr’s functionality. They’re the endpoints behind URIs
such as /account or /drops. They’re the logic cores—the server’s interpreters of cli-
ents’ requests.
Request handler implementations are where the framework actually becomes
Droplr’s API server.
THE PARENT INTERFACE
Each request handler, whether directly or through a subclass hierarchy, is a realization
of the interface RequestHandler.
In its essence, the RequestHandler interface represents a stateless handler for
requests (instances of HttpRequest) and chunks (instances of HttpChunk). It’s an
extremely simple interface with a couple of methods to help the request controller
perform and/or decide how to perform its duties, such as:
Is the request handler stateful or stateless? Does it need to be cloned from a
prototype or can the prototype be used to handle the request?
Is the request handler CPU or non-CPU bound? Can it execute on Netty’s
worker threads or should it be executed in a separate thread pool?
Roll back current changes.
Clean up any used resources.
This interface is all the RequestController knows about actions. Through its very
clear and concise interface, the controller can interact with stateful and stateless, CPU-
bound and non-CPU-bound handlers (or combinations of these) in an isolated and
implementation-agnostic fashion.
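The interface itself isn’t shown in this excerpt; purely as a guess at its shape, a contract answering those questions might look something like the following. The method names are hypothetical, and HttpRequest and HttpChunk are the Netty 3.x HTTP codec types the text refers to.

import org.jboss.netty.handler.codec.http.HttpChunk;
import org.jboss.netty.handler.codec.http.HttpRequest;

public interface RequestHandler {
    boolean isStateful();            // clone from a prototype, or reuse the prototype?
    boolean isCpuBound();            // run on Netty worker threads, or a separate pool?
    void handleRequest(HttpRequest request);
    void handleChunk(HttpChunk chunk);
    void rollback();                 // roll back current changes
    void cleanup();                  // clean up any used resources
}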
HANDLER IMPLEMENTATIONS
The simplest realization of RequestHandler is AbstractRequestHandler, which repre-
sents the root of a subclass hierarchy that becomes ever more specific until it reaches
the actual handlers that provide all of Droplr’s functionality. Eventually it leads to the
stateful implementation SimpleHandler, which executes in a non-IO-worker thread
and is therefore not CPU-bound. SimpleHandler is ideal for quickly implementing
endpoints that do the typical tasks of reading in JSON, hitting the database, and then
writing out some JSON.
THE UPLOAD REQUEST HANDLER
The upload request handler is the crux of the whole Droplr API server. It was the action
that shaped the design of the webserver module—the frameworky part of the server—
and it’s by far the most complex and tuned piece of code in the whole stack.
During uploads, the server has dual behaviors:
On one side, it acts as a server for the API clients that are uploading the files.
On the other side, it acts as client to S3 to push the data it receives from the API
clients.
To act as a client, the server uses an HTTP client library that is also built with Netty.1
This asynchronous library exposes an interface that perfectly matches the needs of
the server. It begins executing an HTTP request and allows data to be fed to it as it
becomes available, and this greatly reduces the complexity of the client facade of the
upload request handler.
14.1.5 Performance
After the initial version of the server was complete, I ran a batch of performance tests.
The results were nothing short of mind blowing. After continuously increasing the
load in disbelief, I saw the new server peak at 10~12x faster uploads over the old LAMP
stack—a full order of magnitude faster—and it could handle over 1000x more concur-
rent uploads, for a total of nearly 10 k concurrent uploads (running on a single EC2
large instance).
The following factors contributed to this:
It was running in a tuned JVM.
It was running in a highly tuned custom stack, created specifically to address
this problem, instead of an all-purpose web framework.
The custom stack was built with Netty using NIO (selector-based model), which
meant it could scale to tens or even hundreds of thousands of concurrent con-
nections, unlike the one-process-per-client LAMP stack.
There was no longer the overhead of receiving a full file and then uploading it
to S3 in two separate phases. The file was now streamed directly to S3.
1
You can find the HTTP client library at https://github.com/brunodecarvalho/http-client.
Real-time updates are an integral part of the user experience in modern applications.
As users come to expect this behavior, more and more applications are pushing data
changes to users in real time. Real-time data synchronization is difficult to achieve
with the traditional three-tiered architecture, which requires developers to manage
their own ops, servers, and scaling. By maintaining real-time, bidirectional communi-
cation with the client, Firebase provides an immediately intuitive experience allowing
developers to synchronize application data across diverse clients in a few minutes—all
without any backend work, servers, ops, or scaling required.
Implementing this presented a difficult technical challenge, and Netty was the
optimal solution in building the underlying framework for all network communica-
tions in Firebase. This study will provide an overview of Firebase’s architecture, and
then examine three ways Firebase uses Netty to power its real-time synchronization
service:
Long polling
HTTP 1.1 keep-alive and pipelining
Control of SSL handler
Firebase servers take incoming data updates and immediately synchronize them to all
of the connected clients that have registered interest in the changed data. To enable
real-time notification of state changes, clients maintain an active connection to Fire-
base at all times. This connection may range from an abstraction over a single Netty
channel to an abstraction over multiple channels or even multiple, concurrent
abstractions if the client is in the middle of switching transport types.
Because clients can connect to Firebase in a variety of ways, it’s important to keep
the connection code modular. Netty’s Channel abstraction is a fantastic building block
for integrating new transports into Firebase. In addition, the pipeline-and-handler
pattern makes it simple to keep transport-specific details isolated and provide a com-
mon message stream abstraction to the application code. Similarly, this greatly simpli-
fies adding support for new protocols. Firebase added support for a binary transport
simply by adding a few new handlers to the pipeline. Netty’s speed, level of abstrac-
tion, and fine-grained control made it an excellent framework for implementing real-
time connections between the client and server.
mented for each request. In addition, each request includes metadata about the
number of messages in the payload. If a message spans multiple requests, the portion
of the message contained in this payload is included in the metadata.
The server maintains a ring buffer of incoming message segments and processes
them as soon as they’re complete and no incomplete messages are ahead of them.
Downstream is easier because the long-polling transport responds to an HTTP GET
request and doesn’t have the same restrictions on payload size. In this case, a serial
number is included and is incremented once for each response. The client can pro-
cess all messages in the list as long as it has received all responses up to the given serial
number. If it hasn’t, it buffers the list until it receives the outstanding responses.
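That buffering step is simple to sketch. The class below is an illustrative reconstruction rather than Firebase’s code: responses are held in a map keyed by serial number and flushed to the application only when every earlier response has arrived.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

public class InOrderDelivery {
    private final Map<Long, List<String>> pending = new HashMap<>();
    private final Consumer<List<String>> deliver; // hands messages to the application
    private long nextSerial = 0;

    public InOrderDelivery(Consumer<List<String>> deliver) { this.deliver = deliver; }

    public synchronized void onResponse(long serial, List<String> messages) {
        pending.put(serial, messages);
        // Flush every contiguous response starting at the next expected serial number.
        while (pending.containsKey(nextSerial)) {
            deliver.accept(pending.remove(nextSerial));
            nextSerial++;
        }
    }
}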
CLOSE NOTIFICATIONS
The second property enforced in the long-polling transport is close notification. In
this case, having the server be aware that the transport has closed is significantly more
important than having the client recognize the close. The Firebase library used by cli-
ents queues up operations to be run when a disconnect occurs, and those operations
can have an impact on other still-connected clients. So it’s important to know when a
client has actually gone away. Implementing a server-initiated close is relatively simple
and can be achieved by responding to the next request with a special protocol-level
close message.
Implementing client-side close notifications is trickier. The same close notification
can be used, but there are two things that can cause this to fail: the user can close the
browser tab, or the network connection could disappear. The tab-closure case is han-
dled with an iframe that fires a request containing the close message on page unload.
The second case is dealt with via a server-side timeout. It’s important to pick your
timeout values carefully, because the server is unable to distinguish a slow network
from a disconnected client. That is to say, there’s no way for the server to know that a
request was actually delayed for a minute, rather than the client losing its network
connection. It’s important to choose an appropriate timeout that balances the cost of
false positives (closing transports for clients on slow networks) against how quickly the
application needs to be aware of disconnected clients.
Figure 14.4 demonstrates how the Firebase long-polling transport handles differ-
ent types of requests.
In this diagram, each long-poll request illustrates a different scenario. Ini-
tially, the client sends a poll (poll 0) to the server. Some time later, the server receives
data from elsewhere in the system that is destined for this client, so it responds to poll
0 with the data. As soon as the poll returns, the client sends a new poll (poll 1),
because it currently has none outstanding. A short time later, the client needs to send
data to the server. Since it only has a single poll outstanding, it sends a new one (poll 2)
that includes the data to be delivered. Per the protocol, as soon as the server has two
simultaneous polls from the same client, it responds to the first one. In this case, the
server has no data available for the client, so it sends back an empty response. The cli-
ent also maintains a timeout and will send a second poll when it fires, even if it has no
Figure 14.4 Long polling
additional data to send. This insulates the system from failures due to browsers timing
out slow requests.
tabs. Given long-polling requests, this is difficult and requires proper management of
a queue of HTTP requests. Long-polling requests can be interrupted, but proxied
requests can’t. Netty made serving multiple request types easy:
Static HTML pages —Cached content that can be returned with no processing;
examples include a single-page HTML app, robots.txt, and crossdomain.xml.
REST requests —Firebase supports traditional GET, POST, PUT, DELETE, PATCH, and
OPTIONS requests.
WebSocket —A bidirectional connection between a browser and a Firebase server
with its own framing protocol.
Long polling —These are similar to HTTP GET requests but are treated differently
by the application.
Proxied requests —Some requests can’t be handled by the server that receives
them. In that case, Firebase proxies the request to the correct server in its clus-
ter, so that end users don’t have to worry about where data is located. These are
like the REST requests, but the proxying server treats them differently.
Raw bytes over SSL—A simple TCP socket running Firebase’s own framing proto-
col and optimized handshaking.
Firebase uses Netty to set up its pipeline to decode an incoming request and then
reconfigure the remainder of the pipeline appropriately. In some cases, like WebSockets
and raw bytes, once a particular type of request has been assigned a channel, it will
stay that way for its entire duration. In other cases, like the various HTTP requests, the
assignment must be made on a per-message basis. The same channel could handle
REST requests, long-polling requests, and proxied requests.
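A rough sketch of that per-message decision, written against the Netty 4 API covered in this book, might look like the handler below. It is illustrative only, not Firebase’s code: a handler early in the pipeline inspects each decoded HTTP message and either claims the channel permanently, as for WebSockets, or leaves the channel as-is so each message can be routed individually.

import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.handler.codec.http.HttpRequest;

// Illustrative only: route each decoded HTTP message to the right handling path.
public class RequestTypeDispatcher extends ChannelInboundHandlerAdapter {
    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        if (msg instanceof HttpRequest) {
            HttpRequest req = (HttpRequest) msg;
            if ("websocket".equalsIgnoreCase(req.headers().get("Upgrade"))) {
                // A WebSocket upgrade claims the channel for its whole lifetime:
                // install the WebSocket handlers here and retire this dispatcher.
                ctx.pipeline().remove(this);
            }
            // Otherwise the decision is per message; REST, long-polling, and
            // proxied requests can all arrive on the same channel.
        }
        ctx.fireChannelRead(msg); // pass the message to the next handler unchanged
    }
}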
tions at every layer of the protocol stack, and also allowed for very accurate billing,
throttling, and rate limiting, all of which had significant business implications.
Netty made it possible to intercept all inbound and outbound messages and to
count bytes with a small amount of Scala code.
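The team’s code was Scala; the Java sketch below shows the same idea with a single duplex handler that counts readable bytes in both directions. It is illustrative only and assumes the messages at this point in the pipeline are still raw ByteBufs.

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelDuplexHandler;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelPromise;
import java.util.concurrent.atomic.AtomicLong;

public class ByteCountingHandler extends ChannelDuplexHandler {
    private final AtomicLong inbound = new AtomicLong();
    private final AtomicLong outbound = new AtomicLong();

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        if (msg instanceof ByteBuf) inbound.addAndGet(((ByteBuf) msg).readableBytes());
        super.channelRead(ctx, msg);  // forward the message unchanged
    }

    @Override
    public void write(ChannelHandlerContext ctx, Object msg, ChannelPromise promise) throws Exception {
        if (msg instanceof ByteBuf) outbound.addAndGet(((ByteBuf) msg).readableBytes());
        super.write(ctx, msg, promise);  // forward the write unchanged
    }
}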
As smartphone use grows across the globe at unprecedented rates, a number of ser-
vice providers have emerged to assist developers and marketers toward the end of
providing amazing end-user experiences. Unlike their feature phone predecessors,
smartphones crave IP connectivity and seek it across a number of channels (3G, 4G,
WiFi, WiMAX, and Bluetooth). As more and more of these devices access public net-
works via IP-based protocols, the challenges of scale, latency, and throughput become
more and more daunting for back-end service providers.
Thankfully, Netty is well suited to many of the concerns faced by this thundering
herd of always-connected mobile devices. This chapter will detail several practical appli-
cations of Netty in scaling a mobile developer and marketer platform, Urban Airship.
Public
network API
Core
services
Device
messaging 3rd party Figure 14.6 High-level
channel adapter mobile messaging platform
integration
1
Some mobile OSes allow a form of push notifications called local notifications that would not follow this
approach.
approach, an application will rely on a third party to deliver the message to the appli-
cation on behalf of a back-end service. At Urban Airship, both approaches to deliver-
ing push notifications are used, and both leverage Netty extensively.
1
For information on APNS: http://docs.aws.amazon.com/sns/latest/dg/mobile-push-apns.html,
http://bit.ly/189mmpG.
At face value the protocol seems straightforward, but there are nuances to success-
fully addressing all of the preceding concerns, in particular on the JVM:
The APNS specification dictates that certain payload values should be sent in
big-endian ordering (for example, token length).
Step 3 in the previous sequence requires a workaround. Because the JVM
will not allow reading from a closed socket even if data exists in the output buf-
fer, you have two options:
– After a write, perform a blocking read with a timeout on the socket. This has
multiple disadvantages:
– The amount of time to block waiting for an error is non-deterministic. An
error may occur in milliseconds or seconds.
– As socket objects can’t be shared across multiple threads, writes to the
socket must immediately block while waiting for errors. This has dramatic
implications for throughput. If a single message is delivered in a socket
write, no additional messages can go out on that socket until the read
timeout has occurred. When you’re delivering tens of millions of mes-
sages, a three-second delay between messages isn’t acceptable.
– Relying on a socket timeout is an expensive operation. It results in an
exception being thrown and several unnecessary system calls.
– Use asynchronous I/O. In this model, neither reads nor writes block. This
allows writers to continue sending messages to APNS while at the same time
allowing the OS to inform user code when data is ready to be read.
Netty makes addressing all of these concerns trivial while at the same time delivering
amazing throughput.
First, let’s see how Netty simplifies packing a binary APNS message with correct
endian ordering.
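The original listing isn’t reproduced in this excerpt, but the framing itself is easy to sketch. In the hypothetical helper below, the big-endian requirement comes for free because Netty’s ByteBuf writes multi-byte values in network (big-endian) order by default; the layout shown is the legacy APNS simple-notification frame.

import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import java.nio.charset.StandardCharsets;

public final class ApnsFrames {
    public static ByteBuf simpleNotification(byte[] deviceToken, String payloadJson) {
        byte[] payload = payloadJson.getBytes(StandardCharsets.UTF_8);
        ByteBuf buf = Unpooled.buffer(1 + 2 + deviceToken.length + 2 + payload.length);
        buf.writeByte(0);                      // command: simple notification
        buf.writeShort(deviceToken.length);    // token length, big-endian
        buf.writeBytes(deviceToken);
        buf.writeShort(payload.length);        // payload length, big-endian
        buf.writeBytes(payload);
        return buf;
    }
}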
It’s worth noting how easy Netty makes negotiating an X.509 authenticated connec-
tion in conjunction with asynchronous I/O. In early prototypes of APNS code at Urban
Airship without Netty, negotiating an asynchronous X.509 authenticated connection
required over 80 lines of code and a thread pool simply to connect. Netty hides all the
complexity of the SSL handshake, the authentication, and most importantly the encryp-
tion of cleartext bytes to cipher text and the key renegotiation that comes along with
using SSL. These incredibly tedious, error prone, and poorly documented APIs in the
JDK are hidden behind three lines of Netty code.
At Urban Airship, Netty plays a role in all connectivity to numerous third-party
push notification services including APNS and Google’s GCM. In every case, Netty is
flexible enough to allow explicit control over exactly how integration takes place from
higher-level HTTP connectivity behavior down to basic socket-level settings such as
TCP keep-alive and socket buffer sizing.
Over time, Urban Airship learned several critical lessons as connections from mobile
devices continued to grow:
The diversity of mobile carriers can have a dramatic effect on device connectiv-
ity.
Many carriers don’t allow TCP keep-alive functionality. Given that, many carriers
will aggressively cull idle TCP sessions.
UDP isn’t a viable channel for messaging to mobile devices because many carri-
ers disallow it.
The overhead of SSLv3 is an acute pain for short-lived connections.
Given the challenges of mobile growth and the lessons learned by Urban Airship,
Netty was a natural fit for implementing a mobile messaging platform for reasons
highlighted in the following sections.
1
Note the distinction of a physical server in this case. Although virtualization offers many benefits, leading cloud
providers were regularly unable to accommodate more than 200,000–300,000 concurrent TCP connections
to a single virtual host. With connections at or above this scale, expect to use bare metal servers and expect to
pay close attention to the NIC (Network Interface Card) vendor.
14.4 Summary
This chapter aimed to provide insight into real-world use of Netty and how it has
helped companies to solve significant networking problems. It’s worth noting how in
all cases Netty was leveraged not only as a code framework, but also as an essential
component of development and architectural best practices.
In the next chapter we’ll present case studies contributed by Facebook and Twitter
describing open source projects that evolved from Netty-based code originally devel-
oped to address internal needs.
What's inside
Netty from the ground up
Asynchronous, event-driven programming
Implementing services using different protocols
Covers Netty 4.x
This book assumes readers are comfortable with Java and basic network architecture.
Docker in Practice
by Ian Miell and Aidan Hobson Sayers
ISBN: 9781617292729
325 pages
$44.99
March 2016
Mesos in Action
by Roger Ignazio
ISBN: 9781617292927
325 pages
$44.99
April 2016
Netty in Action
by Norman Maurer and Marvin Allen Wolfthal
ISBN: 9781617291470
296 pages
$54.99
September 2015