Amazon Aurora MySQL Database Administrator’s Handbook: Connection Management
March 2019
Notices
Customers are responsible for making their own independent assessment of the
information in this document. This document: (a) is for informational purposes only, (b)
represents AWS’s current product offerings and practices, which are subject to change
without notice, and (c) does not create any commitments or assurances from AWS and
its affiliates, suppliers or licensors. AWS’s products or services are provided “as is”
without warranties, representations, or conditions of any kind, whether express or
implied. AWS’s responsibilities and liabilities to its customers are controlled by AWS
agreements, and this document is not part of, nor does it modify, any agreement
between AWS and its customers.
© 2019 Amazon Web Services, Inc. or its affiliates. All rights reserved.
Abstract
This paper outlines the best practices for managing database connections, setting
server connection parameters, and configuring client programs, drivers, and connectors.
It is recommended reading for Amazon Aurora MySQL database administrators (DBAs)
and application developers.
Amazon Web Services Amazon Aurora MySQL Database Administrator’s Handbook
Introduction
Amazon Aurora MySQL (Aurora MySQL) is a managed relational database engine,
wire-compatible with MySQL 5.6 and 5.7. Most of the drivers, connectors, and tools that
you currently use with MySQL can be used with Aurora MySQL with little or no change.
Aurora MySQL database (DB) clusters provide advanced features such as:
DNS Endpoints
An Aurora DB cluster consists of one or more instances and a cluster volume that
manages the data for those instances. There are two types of instances:
• Primary instance – Supports read and write statements. Currently, there can be
one primary instance per DB cluster.
Aurora supports the following types of Domain Name System (DNS) endpoints:
• Reader endpoint – Includes all Aurora Replicas in the DB cluster under a single
DNS CNAME. You can use the reader endpoint to implement DNS round-robin
load balancing for read-only connections.
• Instance endpoint – Each instance in the DB cluster has its own individual
endpoint. You can use this endpoint to connect directly to a specific instance.
• Relatively high memory use when there is a large number of user connections,
even if the connections are completely idle
• Higher internal server contention and context switching overhead when working
with thousands of user connections
Aurora MySQL supports a thread pool approach that addresses these issues. You can
characterize the thread pool approach as follows:
• The thread pool automatically scales itself. The Aurora MySQL database
process continuously monitors its thread pool state and launches new workers
or destroys existing ones as needed. This is transparent to the user and doesn’t
need any manual configuration.
The following is a network packet trace for a MySQL connection handshake taking
place between a client and a MySQL-compatible server located in the same Availability
Zone:
As you can see, even the simple act of opening and closing a single connection
involves an exchange of several network packets. The connection overhead becomes
more pronounced when you consider SQL statements issued by drivers as part of
connection setup (for example, SET variable_name = value commands used to set
session-level configuration). Server-side thread pooling doesn’t eliminate this type of
overhead.
Common Misconceptions
The following are common misconceptions for database connection management.
If the server uses connection pooling, you don’t need a pool on the application
side. As explained previously, this isn’t true for workloads where connections are
opened and torn down very frequently, and clients execute relatively few statements per
connection.
You might not need a connection pool if your connections are long lived. This means
that connection activity time is much longer than the time required to open and close the
connection. You can run a packet trace with tcpdump and see how many packets you
need to open/close connections versus how many packets you need to run your queries
within those connections. Even if the connections are long lived, you can still benefit
from using a connection pool to protect the database against connection surges, that is,
large bursts of new connection attempts.
Idle connections don’t use memory. This isn’t true because the operating system and
the database process both allocate an in-memory descriptor for each user connection.
What is typically true is that Aurora MySQL uses less memory than MySQL Community
Edition to maintain the same number of connections. However, memory usage for idle
connections is still not zero, even with Aurora MySQL.
The general best practice is to avoid opening significantly more connections than you
need.
Downtime depends entirely on database stability and database features. This isn’t
true because the application design and configuration play an important role in
determining how fast user traffic can recover following a database event. For more
details, see the next section, “Best Practices.”
Best Practices
The following are best practices for managing database connections and configuring
connection drivers and pools.
+-------------------+--------+-----------------------------+
| server_id         | role   | replica_lag_in_milliseconds |
+-------------------+--------+-----------------------------+
| aurora-node-usw2a | writer |                           0 |
| aurora-node-usw2b | reader |          19.253999710083008 |
+-------------------+--------+-----------------------------+
2 rows in set (0.00 sec)
Notice that the table contains cluster-wide metadata. You can query the table on any
instance in the DB cluster.
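As an illustration, a client could turn rows like the ones in the sample output into a simple topology map. The following Python sketch assumes the rows come from the INFORMATION_SCHEMA.REPLICA_HOST_STATUS table through any DB-API cursor. The query text, the MASTER_SESSION_ID comparison, and the function names are assumptions made for this example, so verify them against the documentation for your Aurora MySQL version before relying on them.

```python
from typing import Dict, Iterable, List, Tuple

# Assumed query: check the table and column names against the
# Aurora MySQL documentation for your engine version.
TOPOLOGY_QUERY = (
    "SELECT server_id, "
    "IF(session_id = 'MASTER_SESSION_ID', 'writer', 'reader') AS role, "
    "replica_lag_in_milliseconds "
    "FROM information_schema.replica_host_status"
)

Row = Tuple[str, str, float]


def parse_topology(rows: Iterable[Row]) -> Dict[str, List[str]]:
    """Group server IDs by role ('writer' or 'reader')."""
    topology: Dict[str, List[str]] = {"writer": [], "reader": []}
    for server_id, role, _lag_ms in rows:
        topology.setdefault(role, []).append(server_id)
    return topology


def fetch_topology(cursor) -> Dict[str, List[str]]:
    """Run the query on an open DB-API cursor and parse the result."""
    cursor.execute(TOPOLOGY_QUERY)
    return parse_topology(cursor.fetchall())


# Rows copied from the sample output.
sample_rows = [
    ("aurora-node-usw2a", "writer", 0.0),
    ("aurora-node-usw2b", "reader", 19.253999710083008),
]
topology = parse_topology(sample_rows)
```

Keeping the parsing step separate from the query makes it possible to test the routing logic without a live DB cluster.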
For the purpose of this whitepaper, a smart driver is a database driver or connector with
the ability to read DB cluster topology from the metadata table. It can route new
connections to individual instance endpoints without relying on high-level cluster
endpoints. A smart driver is also typically capable of load balancing read-only
connections across the available Aurora Replicas in a round-robin fashion.
If you’re using a smart driver, the recommendations listed in the following sections still
apply. A smart driver can automate and abstract certain layers of database connectivity.
However, it doesn’t automatically configure itself with optimal settings, or automatically
make the application resilient to failures. For example, when using a smart driver, you
still need to ensure that the connection validation and recycling functions are configured
correctly, there’s no excessive DNS caching in the underlying system and network
layers, transactions are managed correctly, and so on.
It’s a good idea to evaluate the use of smart drivers in your setup. Note that if a third-
party driver contains Aurora MySQL-specific functionality, it doesn’t mean that it has
been officially tested, validated, or certified by AWS. Also note that due to the advanced
built-in features and higher overall complexity, smart drivers are likely to receive
updates and bug fixes more frequently than traditional (barebones) drivers. You should
regularly review the driver’s release notes and use the latest available version whenever
possible.
DNS Caching
Unless you use a smart database driver, you depend on DNS record updates and DNS
propagation for failovers, instance scaling, and load balancing across Aurora Replicas.
Currently, Aurora DNS zones use a short Time-To-Live (TTL) of 5 seconds. Ensure that
your network and client configurations don’t further increase the DNS cache TTL.
Remember that DNS caching can occur anywhere from your network layer, through the
operating system, to the application container. For example, Java virtual machines
(JVMs) are notorious for caching DNS indefinitely unless configured otherwise.
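One way to check whether a client host sees DNS changes promptly is to resolve an endpoint repeatedly and compare the answers. The following Python sketch uses the standard library's socket.getaddrinfo. The function names are hypothetical, and the resolver parameter exists only so that the logic can be exercised without network access. Any caching in the operating system or network layer still applies to the lookups.

```python
import socket
from typing import Callable, List


def resolve_ips(hostname: str, port: int = 3306,
                resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the sorted, de-duplicated IP addresses for hostname."""
    results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({sockaddr[0] for *_rest, sockaddr in results})


def has_changed(previous: List[str], current: List[str]) -> bool:
    """True if the set of resolved addresses differs between two lookups."""
    return set(previous) != set(current)
```

For example, you might call resolve_ips with your own reader endpoint name a few seconds apart during a test failover and log whether has_changed reports a difference. If the answer stays the same for much longer than the DNS TTL, a cache somewhere in the path is a likely cause.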
Here are some examples of issues that can occur if you don’t follow DNS caching best
practices:
If you can’t rely on client applications (or interactive clients) to close idle connections,
use the server’s wait_timeout and interactive_timeout parameters to configure
idle connection timeout. The default timeout value is fairly high at 28,800 seconds (8
hours). You should tune it down to a value that’s acceptable in your environment. See
the MySQL Reference Manual for details (note 3).
Consider using connection pooling to protect the database against connection surges.
Also consider connection pooling if the application opens large numbers of connections
(for example, thousands or more per second) and the connections are short lived, that
is, the time required for connection setup and teardown is significant compared to the
total connection lifetime. If your development language/framework doesn’t support
connection pooling, you can use a connection proxy instead. ProxySQL, MaxScale, and
ScaleArc are examples of third-party proxies compatible with the MySQL protocol. See
Connection Scaling for more notes on connection pools versus proxies.
• Check and validate connection health when the connection is borrowed from the
pool. The validation query can be as simple as SELECT 1. However, in Aurora
you can also leverage connection checks that return a different value depending
on whether the instance is a primary instance (read/write) or an Aurora Replica
(read-only). For example, you can use the @@innodb_read_only variable to
determine the instance role. If the variable value is TRUE, you're on an Aurora
Replica.
• Check and validate connections periodically even when they're not borrowed. It
helps detect and clean up broken or unhealthy connections before an application
thread attempts to use them.
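The role-aware validation described above can be sketched as follows. This Python example assumes a DB-API 2.0 cursor from whichever MySQL driver you use. The function name and the stub cursor are hypothetical and exist only to show the idea; a real connection pool would normally call such a check from its own validation hook.

```python
def check_connection(cursor, expect_writer: bool) -> bool:
    """Validate a pooled connection and confirm the instance role.

    Returns False if the query fails or if the instance role does not
    match what the caller expects.
    """
    try:
        cursor.execute("SELECT @@innodb_read_only")
        row = cursor.fetchone()
    except Exception:
        # Any driver error means the connection should be discarded.
        return False
    if row is None:
        return False
    is_replica = bool(int(row[0]))  # 1 on an Aurora Replica, 0 on the primary
    return is_replica != expect_writer


class StubCursor:
    """Stand-in for a real driver cursor, for demonstration only."""

    def __init__(self, read_only: int):
        self._value = read_only

    def execute(self, sql: str) -> None:
        self._sql = sql

    def fetchone(self):
        return (self._value,)


writer_ok = check_connection(StubCursor(0), expect_writer=True)
stale_writer = check_connection(StubCursor(1), expect_writer=True)
```

A check like this lets a pool discard connections that still point at a former primary instance after a failover, which SELECT 1 alone would not detect.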
Connection Scaling
The most common technique for scaling web service capacity is to add or remove
application servers (instances) in response to changes in user traffic. Each application
server can use a database connection pool.
This approach causes the total number of database connections to grow proportionally
with the number of application instances. For example, 20 application servers
configured with 200 database connections each would require a total of 4,000 database
connections. If the application pool scales up to 200 instances (for example, during
peak hours), the total connection count will reach 40,000. Under a typical web
application workload, most of these connections are likely idle. In extreme cases, this
can limit database scalability: idle connections do take server resources, and you’re
opening significantly more of them than you need. Also, the total number of connections
is not easy to control because it’s not something you configure directly, but rather
depends on the number of application servers.
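The arithmetic in this example is simple enough to keep as a small planning helper. The following Python sketch reproduces the numbers above. The pool_size_for_budget function is a hypothetical addition that works backward from a connection budget you choose; it is not something the database calculates for you.

```python
def total_connections(app_servers: int, pool_size: int) -> int:
    """Upper bound on database connections opened by all app servers."""
    return app_servers * pool_size


def pool_size_for_budget(connection_budget: int, app_servers: int) -> int:
    """Largest per-server pool size that stays within a total budget."""
    if app_servers <= 0:
        raise ValueError("app_servers must be positive")
    return connection_budget // app_servers


normal = total_connections(20, 200)           # 20 servers x 200 connections
peak = total_connections(200, 200)            # 200 servers x 200 connections
capped = pool_size_for_budget(4_000, 200)     # per-server pool at peak
```

With these inputs, normal is 4,000 and peak is 40,000, matching the figures in the text. Holding the total at 4,000 during peak hours would leave each of the 200 servers with a pool of 20 connections.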
• Introduce a connection proxy between the database and the application. On one
side, the proxy connects to the database with a fixed number of connections. On
the other side, the proxy accepts application connections and can provide
additional features such as query caching, connection buffering, query
rewriting/routing, and load balancing. ProxySQL, MaxScale, and ScaleArc are
examples of third-party proxies compatible with the MySQL protocol. For even
greater scalability and availability, you can use multiple proxy instances behind a
single DNS endpoint.
With autocommit disabled, the connection is always in a transaction. You can commit or
roll back the current transaction, at which point the server immediately opens a new
one.
Recommendations:
• Always run with autocommit mode enabled. Set the autocommit parameter to 1
on the database side (which is the default) and on the application side (which
might not be the default).
Note that these recommendations are not specific to Aurora MySQL. They apply to
MySQL and other databases that use the InnoDB storage engine.
• You can obtain the metadata of currently running transactions from the
INFORMATION_SCHEMA.INNODB_TRX table. The TRX_STARTED column contains
the transaction start time, and you can use it to calculate transaction age. A
transaction is worth investigating if it’s been running for several minutes or more.
See the MySQL Reference Manual for details about the table (note 5).
• You can read the size of the garbage collection backlog from InnoDB’s
trx_rseg_history_len counter in the
INFORMATION_SCHEMA.INNODB_METRICS table. See the MySQL Reference
Manual for details about the table (note 6). The larger the counter value is, the more
severe the impact might be in terms of query performance, CPU usage, and
storage consumption. Values in the range of tens of thousands indicate that the
garbage collection is somewhat delayed. Values in the range of millions or tens
of millions might be dangerous and should be investigated.
NOTE: In Aurora, all DB instances use the same storage volume, which
means that the garbage collection is cluster-wide and not specific to each
instance. Consequently, a runaway transaction on one instance can
impact all instances. Therefore, you should monitor long transactions on
all DB instances.
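The two monitoring checks above can be sketched as a pair of queries plus a small classifier. This Python example assumes a DB-API cursor from a driver that uses the %s placeholder style (PyMySQL and mysqlclient do). The function names, the five-minute default, and the exact cutoffs of 10,000 and 1,000,000 are assumptions chosen to mirror the ranges described in the text, not official thresholds.

```python
# Transactions older than a given number of seconds, oldest first.
LONG_TRX_QUERY = (
    "SELECT trx_id, trx_mysql_thread_id, trx_started, "
    "TIMESTAMPDIFF(SECOND, trx_started, NOW()) AS age_seconds "
    "FROM information_schema.innodb_trx "
    "WHERE trx_started < NOW() - INTERVAL %s SECOND "
    "ORDER BY trx_started"
)

# Current size of the garbage collection backlog.
HISTORY_LEN_QUERY = (
    "SELECT `count` FROM information_schema.innodb_metrics "
    "WHERE name = 'trx_rseg_history_len'"
)


def find_long_transactions(cursor, min_age_seconds: int = 300):
    """Return rows for transactions older than min_age_seconds."""
    cursor.execute(LONG_TRX_QUERY, (min_age_seconds,))
    return cursor.fetchall()


def classify_history_length(value: int) -> str:
    """Map the counter to a coarse status using assumed cutoffs."""
    if value >= 1_000_000:
        return "investigate"   # millions: might be dangerous
    if value >= 10_000:
        return "delayed"       # tens of thousands: somewhat delayed
    return "ok"
```

Because garbage collection is cluster-wide in Aurora, a monitoring job would run find_long_transactions against every DB instance rather than only the primary instance.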
Connection Handshakes
A lot of work can happen behind the scenes when an application connector or a
graphical user interface (GUI) tool opens a new database session. Drivers and client
tools commonly execute a series of statements to set up session configuration (for
example, SET SESSION variable = value). This increases the cost of creating new
connections and delays when your application can start issuing queries.
The cost of connection handshakes becomes even more important if your applications
are very sensitive to latency. OLTP or key-value workloads that expect single-digit
millisecond latency can be visibly impacted if each connection is expensive to open. For
example, if the driver executes six statements to set up a connection and each
statement takes just one millisecond to execute, your application will be delayed by six
milliseconds before it issues its first query.
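To put the example above in proportion, the following Python sketch computes the handshake delay and the share of total session time it represents. The function names and the 2 millisecond figure for useful work are hypothetical values chosen for illustration.

```python
def handshake_delay_ms(statements: int, ms_per_statement: float) -> float:
    """Time spent on setup statements before the first real query."""
    return statements * ms_per_statement


def handshake_share(statements: int, ms_per_statement: float,
                    work_ms: float) -> float:
    """Fraction of total session time spent on handshake statements."""
    delay = handshake_delay_ms(statements, ms_per_statement)
    return delay / (delay + work_ms)


delay = handshake_delay_ms(6, 1.0)       # six statements at 1 ms each
share = handshake_share(6, 1.0, 2.0)     # followed by 2 ms of real work
```

Here delay is 6 milliseconds, and if the session then performs only 2 milliseconds of useful work, the handshake accounts for 75 percent of the session time. A pooled connection pays this cost once rather than on every request.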
Recommendations:
• Use the Aurora MySQL Advanced Audit, the General Query Log, or network-
level packet traces (for example, with tcpdump) to obtain a record of statements
executed during a connection handshake. Whether or not you’re experiencing
connection or latency issues, you should be familiar with the internal operations
of your database driver.
• For each handshake statement, you should be able to explain its purpose and
describe its impact on queries you'll subsequently execute on that connection.
• Each handshake statement requires at least one network roundtrip and will
contribute to higher overall session latency. If the number of handshake
statements appears to be significant relative to the number of statements doing
actual work, determine if you can disable any of the handshake statements.
Consider using connection pooling to reduce the number of connection
handshakes.
DNS load balancing works at the connection level (not the individual query level). You
must keep resolving the endpoint without caching DNS to get a different instance IP on
each resolution. If you only resolve the endpoint once and then keep the connection in
your pool, every query on that connection goes to the same instance. If you cache DNS,
you receive the same instance IP each time you resolve the endpoint.
If you don’t follow best practices, these are examples of issues that can occur:
• Unequal use of Aurora Replicas, for example, one of the Aurora Replicas is
receiving most or all of the traffic while the other Aurora Replicas sit idle.
• After you add or scale an Aurora Replica, it doesn’t receive traffic or it begins to
receive traffic after an unexpectedly long delay.
• After you remove an Aurora Replica, applications continue to send traffic to that
instance.
For more information, see earlier sections about DNS Endpoints and DNS Caching.
The only scalable way of addressing this challenge is to assume that issues/changes
will occur and design your applications accordingly.
Examples:
• If Aurora MySQL detects that the primary instance has failed, it can promote a
new primary instance and fail over to it, which typically happens within 30
seconds. Your application should be designed to recognize the change quickly
and without manual intervention.
• If you remove instances from a DB cluster, your application should not try to
connect to them.
Test your applications extensively and prepare a list of assumptions about how the
application should react to database events. Then, experimentally validate the
assumptions.
If you don’t follow best practices, database events (for example, failovers, scaling,
software upgrades) might result in longer than expected downtime. For example, you
might notice that a failover took 30 seconds (per the DB cluster’s Event Notifications)
but the application remained down for much longer.
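One common way to let an application recover without manual intervention is to open a fresh connection and retry after a failure. The following Python sketch shows that pattern with exponential backoff. The function name, the default attempt count, and the delays are assumptions for illustration; the connect and work callables stand in for your own driver code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_reconnect(connect: Callable[[], object],
                       work: Callable[[object], T],
                       attempts: int = 5,
                       base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Run work(conn), reconnecting with exponential backoff on failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Exception = RuntimeError("no attempt made")
    for attempt in range(attempts):
        try:
            # A fresh connection gives the driver a chance to resolve
            # the endpoint again and reach the new primary instance.
            conn = connect()
            return work(conn)
        except Exception as exc:  # narrow this to your driver's error types
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * (2 ** attempt))
    raise last_error
```

With the defaults, the waits between attempts are 0.5, 1, 2, and 4 seconds. Retrying is only safe when the work is idempotent or wrapped in a transaction that was rolled back, so apply this pattern selectively.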
Server Configuration
There are two major server configuration variables worth mentioning in the context of
this whitepaper: max_connections and max_connect_errors.
If you also enabled performance_schema, be extra careful with the setting. The
Performance Schema memory structures are sized automatically based on server
configuration variables, including max_connections. The higher you set the variable,
the more memory Performance Schema uses. In extreme cases, this can lead to out-of-
memory issues on smaller instance types.
A common (but incorrect) practice is to set the parameter to a very high value to avoid
client connectivity issues. This practice isn’t recommended because it:
• Can hide real threats, for example, someone actively trying to break into the
server.
• max_connect_errors variable (note 8)
• host_cache table (note 11)
Conclusion
Understanding and implementing connection management best practices is critical to
achieve scalability, reduce downtime, and ensure smooth integration between the
application and database layers. You can apply most of the recommendations provided
in this whitepaper with little to no engineering effort.
The guidance provided in this whitepaper should help you introduce improvements in
your current and future application deployments using Aurora MySQL DB clusters.
Contributors
Contributors to this document include:
Further Reading
For additional information, see:
Document Revisions
Date         Description
March 2019   Minor content updates to the following topics: Introduction, DNS
             Endpoints, and Server Configuration.
Notes
1 http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Aurora.Overview.html
2 https://mariadb.com/kb/en/the-mariadb-library/failover-and-high-availability-with-mariadb-connector-j/#specifics-for-amazon-aurora
3 https://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_wait_timeout
4 https://dev.mysql.com/doc/refman/5.6/en/innodb-autocommit-commit-rollback.html
5 https://dev.mysql.com/doc/refman/5.6/en/innodb-trx-table.html
6 https://dev.mysql.com/doc/refman/5.6/en/innodb-metrics-table.html
7 https://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_max_connections
8 https://dev.mysql.com/doc/refman/5.6/en/server-system-variables.html#sysvar_max_connect_errors
9 https://dev.mysql.com/doc/refman/5.6/en/blocked-host.html
10 https://dev.mysql.com/doc/refman/5.6/en/server-status-variables.html#statvar_Aborted_connects
11 https://dev.mysql.com/doc/refman/5.6/en/host-cache-table.html
12 http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/CHAP_Aurora.html
13 https://dev.mysql.com/doc/refman/5.6/en/communication-errors.html