NoSQL Technologies: Performance Characteristics and Monitoring
Choose between NoSQL and an RDBMS depending on which of these statements describe your workload:

NoSQL
» Storage should be able to deal with very high load
» You do many write operations on the storage
» You want storage that is horizontally scalable
» Simplicity is good, as in a very simple query language (without joins)

RDBMS
» Storage is expected to be high-load, too, but it mainly consists of read operations
» You want performance over a more sophisticated data structure
» You need a powerful SQL query language
So what is wrong with relational databases? Fundamentally, nothing, but they do have limitations that constrain what can be achieved in certain scenarios. Consider these three problems with RDBMSs:
» RDBMSs use a table-based normalization approach to data, which can be a limited model. Certain
data structures cannot be represented without altering the data, programs, or both.
» They only allow update-in-place activities: Create, Read, Update and Delete. Arguably, updates should never be allowed, because they destroy information. For at least some applications it is better that, when data changes, the database simply adds another record and notes the previous value for that record.
» Performance falls off as RDBMSs normalize data. The reason: normalization requires more tables, table joins, keys and indexes, and thus more internal database operations to parse and execute queries. Pretty soon the database grows into the terabytes, and with the overhead of the RDBMS model that is when things slow down.
NoSQL databases generally fall into four categories:
1. Key-Value Stores. The underlying principle here is the use of a hash table, in which there is a unique key and a pointer to a particular item of data. The key-value model is the simplest and easiest to implement, but it is inefficient when you are only interested in querying or updating part of a value, among other disadvantages. (A short sketch contrasting these models follows the list.)
2. Column Family Stores. These were created to store and process very large amounts of data
distributed over many machines. There are still keys but they point to multiple columns, with
columns being arranged by column family.
3. Document Databases. These were inspired by Lotus Notes and are similar to key-value stores. The model is basically versioned documents that are collections of other key-value collections. The semi-structured documents are stored in formats like JSON. Document databases are essentially the next level of key-value stores, allowing nested values to be associated with each key, and they support more efficient querying.
4. Graph Databases. Instead of tables of rows and columns and the rigid structure of SQL, a flexible
graph model is used which, again, can scale across multiple machines.
Strengths: Graph algorithms e.g. shortest path, connectedness, n degree relationships, etc.
Weaknesses: Has to traverse the entire graph to achieve a definitive answer. Not easy to cluster.
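To make the difference between the first three models concrete, here is a minimal sketch in Python that uses plain dictionaries as stand-ins for the storage engines; the record contents and key names are invented purely for illustration.

# Illustrative only: plain Python dictionaries standing in for the storage engines.

# Key-value store: one opaque value per key. To change the city you must
# rewrite (or at least reparse) the whole value.
kv_store = {
    "user:1001": '{"name": "Ada", "city": "London", "tags": ["admin"]}',
}

# Document store: the value is a nested, semi-structured document (JSON-like),
# so the database itself can query or update individual fields.
doc_store = {
    "user:1001": {
        "name": "Ada",
        "city": "London",
        "tags": ["admin"],
    },
}

# Column-family layout: a row key points to named columns grouped by family.
column_store = {
    "user:1001": {                 # row key
        "profile": {               # column family
            "name": "Ada",
            "city": "London",
        },
        "activity": {
            "last_login": "2011-06-01",
        },
    },
}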
» Logging/Archiving. Log-mining tools are handy because they can access logs across servers, relate
them and analyze them.
» Social Computing Insight. Many enterprises today have provided their users with the ability to do
social computing through message forums, blogs etc.
» External Data Feed Integration. Many companies need to integrate data coming from business
partners. Even if the two parties conduct numerous discussions and negotiations, enterprises have
little control over the format of the data coming to them. Also, there are many situations where those
formats change very frequently – based on the changes in the business needs of partners.
» Enterprise Content Management Service. Content Management is now used across companies’
different functional groups, for instance, HR or Sales. The challenge is bringing together different
groups using different meta data structures in a common content management service.
» Real-time stats/analytics. Sometimes it is necessary to use the database as a way to track real-time
performance metrics for websites (page views, unique visits, etc.) Tools like Google Analytics are
great but not real-time — sometimes it is useful to build a secondary system that provides basic real-
time stats. Other alternatives, such as 24/7 monitoring of web traffic, are a good way to go, too.
Apache Cassandra
Apache Cassandra - like so many other NoSQL tools – is an open-source distributed database system. It was
originally created at Facebook in 2008 – but with much input from other sources, notably Google (BigTable)
and Amazon (Dynamo).
Cassandra has a very flexible schema. As originally described in the Google “BigTable” paper, Cassandra offers
the organization of a traditional RDBMS table layout combined with the flexibility and power of no stringent
structure requirements. This allows you to store your data as you need to – without a performance penalty for
changes, which can be important as your storage needs evolve over time.
Cassandra is based on a key-value model. A database consists of column families, and a column family is a set of key-value pairs. Drawing an analogy with relational databases, you can think of a column family as a table and a key-value pair as a record in a table. A table in Cassandra is a distributed multi-dimensional map indexed by a key. Cassandra can handle maps with four or five dimensions: keyspace, column family, row key and column name, with an optional super column adding the fifth dimension.
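As a rough mental model (not Cassandra's actual API), the multi-dimensional map can be pictured as nested dictionaries; the keyspace, column family and row key names below are made up for the example.

# A mental model of Cassandra's distributed multi-dimensional map,
# sketched as nested Python dicts. All names are illustrative only.
cassandra_data = {
    "AppKeyspace": {                       # keyspace
        "Users": {                         # column family ("table")
            "user:1001": {                 # row key
                "name": "Ada",             # column name -> value
                "email": "ada@example.com",
            },
        },
        "UserEvents": {                    # a super-column family adds a
            "user:1001": {                 # fifth dimension:
                "2011-06-01": {            # super column
                    "event": "login",
                },
            },
        },
    },
}

# Reading a value means walking the dimensions in order:
value = cassandra_data["AppKeyspace"]["Users"]["user:1001"]["name"]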
Cassandra has the ability to scale its read/write throughput linearly with machine count based on its unique
architecture, a very compelling benefit when data volumes are expected to grow rapidly. A collection of
Cassandra nodes together is called a ring and in a ring each Cassandra node does exactly the same thing. To
make all this scaling happen seamlessly, Cassandra nodes employ a gossip protocol which enables the nodes
to talk to each other and figure out where in the ring to send reads and writes.
Like many highly scalable database systems, Cassandra uses data partitioning to achieve some of its incredible
speed characteristics. Cassandra’s default behavior is to partition data randomly across the ring. Incredibly,
even with your data partitioned, you can add and remove nodes from your Cassandra ring without
compromising availability or consistency!
So with a multitude of nodes in your Cassandra ring how do you monitor this massive system? Do you even
need to monitor it? Can it be monitored? Yes, yes, and yes.
Cassandra comes with a very nifty command line tool called nodetool. Nodetool can help you monitor your
ring extremely well by plugging into JMX, the Java Management Extensions. This enables you to gather very
detailed information about your ring. You can get details about individual column families or about the entire
ring itself!
bin/nodetool -host 10.176.0.146 ring
Address Status Load Range Ring
10.176.0.146 Up 459.27 MB 75603446264197340449435394672681112420 |<--|
10.176.1.161 Up 382.53 MB 137462771597874153173150284137310597304 | |
10.176.1.162 Up 511.34 MB 63538518574533451921556363897953848387 |-->|
This provides the basis that can be used to monitor Cassandra but it needs a supporting framework to make it
useful as a production tool. Fortunately, we have done a little bit of the work for you and created an open
source Monitis-Cassandra project that can help you monitor your Cassandra clusters in style. This captures the
data from the nodetool output and uploads it to Monitis’ cloud-based platform, where it is parsed and stored for
later display and analysis. Set the Monitis-Cassandra tool to run using cron or any other scheduling agent and
you have yourself a fully functioning monitoring system for your Cassandra cluster! Check back in to the
Monitis Web site to see all your fresh Cassandra metrics rolling in.
Using this combination of tools you can see how many nodes are currently running and how many are
currently active. This is important, and you might want to set up a Monitis alert to let you know whenever
one of your nodes goes down. (Keep in mind that Cassandra should still be running fine with a single node
failure, but it is definitely important to stay in the know.)
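As a rough illustration of the kind of work the Monitis-Cassandra project automates, the sketch below parses nodetool ring output of the form shown earlier and counts the nodes reporting Up; the nodetool path, host address and alerting rule are assumptions for the example, not part of the project itself.

# Rough sketch: count Up/Down nodes from "nodetool ring" output so the result
# can be pushed to Monitis as a custom monitor. Paths and thresholds are
# illustrative assumptions, not taken from the Monitis-Cassandra project.
import subprocess

def ring_status(host="10.176.0.146", nodetool="bin/nodetool"):
    output = subprocess.check_output([nodetool, "-host", host, "ring"], text=True)
    up, down = 0, 0
    for line in output.splitlines():
        fields = line.split()
        # Data lines look like: "<address> Up 459.27 MB <token> ..."
        if len(fields) >= 2 and fields[1] in ("Up", "Down"):
            if fields[1] == "Up":
                up += 1
            else:
                down += 1
    return up, down

if __name__ == "__main__":
    up, down = ring_status()
    print(f"nodes up: {up}, nodes down: {down}")
    # An alert in Monitis could fire whenever down > 0.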
Since Cassandra is a Java application, you can also get information about the size of the heap space. You
might want to set an alert to go off if your Cassandra node is running out of memory because once it does it
will start paging to disk, which will impact performance dramatically and lose some of the benefit associated
with NoSQL in the first place.
Finally, there are some really fine-grained per column family metrics that come in through this open source
tool. You can get read and write latency for each of your column families – this is great! If your read latency is
moving above 200ms, then set an alert in Monitis to stay on top of the problem. You can set alerts on the
write latency as well. Fine-grained monitoring of individual Column Families is a great way to make sure that
your cluster is operating at peak performance.
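For per-column-family figures, nodetool also offers a cfstats command whose output includes Read Latency and Write Latency lines for each column family. The sketch below shows how such output could be reduced to numbers suitable for an alert; the exact output format varies between Cassandra versions, so treat it as an illustration rather than the open source tool's actual implementation.

# Sketch: extract per-column-family read/write latencies from "nodetool cfstats"
# output. Adjust the patterns to what your nodetool version actually prints.
import re
import subprocess

def cf_latencies(host="10.176.0.146", nodetool="bin/nodetool"):
    output = subprocess.check_output([nodetool, "-host", host, "cfstats"], text=True)
    latencies, current_cf = {}, None
    for line in output.splitlines():
        cf = re.search(r"Column Family:\s*(\S+)", line)
        if cf:
            current_cf = cf.group(1)
            latencies[current_cf] = {}
        lat = re.search(r"(Read|Write) Latency:\s*([\d.]+)\s*ms", line)
        if lat and current_cf:
            latencies[current_cf][lat.group(1).lower()] = float(lat.group(2))
    return latencies

# Example: flag any column family whose read latency drifts above 200 ms.
for cf, lat in cf_latencies().items():
    if lat.get("read", 0) > 200:
        print(f"WARNING: {cf} read latency {lat['read']} ms")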
MongoDB
MongoDB combines the best features of key-value stores, document databases, object databases, and
relational database management systems (RDBMS). What that means is that MongoDB shards automatically
(as with a key-value store), allows JSON-based dynamic schema documents and offers a rich query language
in the manner of an RDBMS. MongoDB also offers a MapReduce implementation feature.
When you use Mongo, you need to understand that it is extremely fast in its default configuration, but that this comes at a price: out of the box, MongoDB is not very concerned if you lose data, if replication fails, or if your mongo instance crashes and loses an hour or so worth of data. This can be overcome by altering configurations, but it would be a major cause for concern if you are storing your users’ data and they expect their comments, financial reports or medical records to be in your system.
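As one hedged example of "altering configurations", recent drivers let you trade some of that speed back for durability by requesting acknowledged, journaled writes. The sketch assumes pymongo and a mongod running with journaling enabled; database and collection names are placeholders.

# Example of tightening MongoDB's durability guarantees from the client side.
# Exact options differ between driver and server versions.
from pymongo import MongoClient

# w=1 waits for the primary to acknowledge the write; j=True additionally
# waits for the write to reach the on-disk journal.
client = MongoClient("mongodb://localhost:27017", w=1, j=True)

client.mydb.comments.insert_one({"user": "ada", "text": "hello"})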
Monitoring MongoDB is very similar to monitoring a normal relational database. On any given machine you need to monitor things like how much CPU it is using, how much RAM, how fast the disks are, etc. If Mongo starts running out of any of your core compute resources then there will inevitably be problems!
Mongo was built to scale. The current standard installation of Mongo supports Master/Slave Replication,
Replica Sets, and Auto-Sharding right out of the box. You can easily get into complex monitoring situations
when you need to monitor not just one database but an entire cluster.
Basically, we have two sets of statistics we would like to collect from our Mongo instances. Firstly there are
the basic computer stats that you need to collect for any machine in your fleet and secondly we want to
collect all these great DB stats from the Mongo HTTP Console and have a place to store and view them. That’s
where Monitis comes into play!
To get started download the Monitis Mongo code onto your mongo server:
Now you can set up the custom Mongo monitors. This will set up the following custom monitors in Monitis:
Now that you have your monitors set up, you can send data by running:
python send_data.py
Schedule the code to run under cron or any other scheduling agent and you’re done. Pretty simple, and now you can watch all your monitoring data stream in.
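In the same spirit as send_data.py, here is a minimal sketch of pulling a few server-level statistics directly from mongod; the fields shown are standard serverStatus values, but the selection, host and database names are just an example, not the contents of the Monitis Mongo code.

# Minimal sketch of gathering MongoDB statistics for a custom monitor.
# Assumes pymongo and a mongod on the default port.
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
status = client.admin.command("serverStatus")

metrics = {
    "connections_current": status["connections"]["current"],
    "opcounters_insert": status["opcounters"]["insert"],
    "opcounters_query": status["opcounters"]["query"],
    "mem_resident_mb": status["mem"]["resident"],
}

# send_data.py would push values like these to Monitis via its API.
print(metrics)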
BerkeleyDB
BerkeleyDB – the grand-daddy of NoSQL databases – started out as a project at UC Berkeley aimed at providing a simple but powerful database management system for BSD Unix. The project was so successful that it was soon spun off to create Sleepycat Software, which was acquired by Oracle in 2006.
BerkeleyDB supports transactions through a mechanism called write-ahead logging. In essence, this means
that changes are written to a log file before the database file is modified. At each checkpoint, the pending
transactions are “flushed” to the database file; if the write fails for some reason, the transaction is rolled back.
Nothing comes free, though – in a multi-process or multi-threaded transactional application, the developer has the responsibility to issue checkpoints periodically, either by using a dedicated thread or by running the db_checkpoint utility.
When it comes to performance, BerkeleyDB’s cache size is the single most important configuration parameter.
In order to achieve maximum performance, the data most frequently accessed by your application should be
cached so that read and write requests do not trigger too much I/O activity.
BerkeleyDB databases are grouped together in application environments - directories which group data and
log files along with common settings used by a particular application. The database cache is global for all
databases in a particular environment and needs to be allocated at the time that environment is created.
Most of the time this is done programmatically. The issue, of course, is that the optimal cache size is directly related to the size of the data set, which is not always known at application design time. Many BerkeleyDB
applications end up storing considerably more data than originally envisioned. Even worse, some applications
do not explicitly specify a cache size at all, so the default cache size of 256kB is allocated – which is far too low
for many applications. Such applications suffer from degraded performance as they accumulate more data;
their performance can be significantly improved by increasing the cache size. Luckily, most of the time this
can be achieved without any code changes by creating a configuration file in your BerkeleyDB application
environment.
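For example, BerkeleyDB reads a file named DB_CONFIG from the environment home directory when the environment is opened, and a set_cachesize line there overrides the value compiled into the application. The 512 MB figure below is only an example; size the cache to your own working set.

# DB_CONFIG - placed in the BerkeleyDB environment home directory.
# set_cachesize <gbytes> <bytes> <ncache>: here 0 GB + 512 MB in a single region.
set_cachesize 0 536870912 1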
The db_stat tool can help you determine the size of the BerkeleyDB cache and the cache hit rate (the number
of pages retrieved from the cache as opposed to loaded from disk). Run the command in the directory
containing the BerkeleyDB environment (or use the -h switch to specify a directory):
$ db_stat -m
264KB 48B Total cache size
...
100004 Requested pages found in the cache (42%)
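To give a feel for what the monitor extracts, here is a rough Python equivalent that pulls the cache hit rate out of db_stat -m output; the environment path is a placeholder, and monitor_bdb.pl remains the supported way to feed this into Monitis.

# Rough sketch: derive the cache hit rate from "db_stat -m" output, similar in
# spirit to what monitor_bdb.pl does in Perl. The environment path is an assumption.
import re
import subprocess

def cache_hit_rate(env_dir="/var/lib/mybdbenv"):
    output = subprocess.check_output(["db_stat", "-m", "-h", env_dir], text=True)
    match = re.search(r"Requested pages found in the cache \((\d+)%\)", output)
    return int(match.group(1)) if match else None

print("cache hit rate:", cache_hit_rate(), "%")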
Let’s see how Monitis can help you keep tabs on the cache hit rate and other critical database metrics. We use a custom monitor created by a Perl script, monitor_bdb.pl, which can be downloaded from our repository on GitHub. To familiarize yourself with its parameters, run ‘monitor_bdb.pl --help‘. As always, we encourage you to look at the source code to understand its inner workings.
Once the monitor window opens, switch to the Line Chart view and select Cache Hit Rate from the selection
box on the left to display the graph:
The notification rule will be saved and Monitis will notify you if the cache hit rate falls below your chosen
threshold.
HBase
HBase is really a clone (or a very close relative) of Google’s Bigtable and was originally created for use with Hadoop. In fact, HBase is a subproject of the Apache Hadoop project. HBase offers database capabilities for Hadoop, which means you can use it as a source or sink for MapReduce jobs. HBase is a column-oriented database, and it is built to provide low-latency requests on top of Hadoop HDFS. Unlike some other columnar databases that provide eventual consistency, HBase is very consistent.
HBase supports unstructured and partially structured data. To do so, data is organized into column families.
You address an individual record, called a “cell” in HBase, with a combination of row key, column family, cell
qualifier, and time stamp. As opposed to RDBMS (relational database management systems), in which you
must define your table well in advance, with HBase you can simply name a column family and then allow the
cell qualifiers to be determined at runtime. This lets you be very flexible and supports an agile approach to
development.
Clustering in HBase is less transparent than in some other NoSQL tools and uses several kinds of servers. HDFS
needs at least one namenode and several datanodes, plus, HBase needs a ZooKeeper cluster, a master and
several region servers. Requests must be made to the master(s). On the HDFS level, existing data are not
sharded automatically. However, new data is sharded. On the HBase level, data is divided into regions that are
sharded automatically across region servers.
If you choose, HBase will allow you to use Google’s Protobuf (Protocol Buffer) API as an alternative to XML. Protobuf is a very efficient way of serializing data: it has a noticeable advantage in compactness (the same data is typically two to three times smaller than XML) and is 20 to 100 times faster to parse.
Interestingly, Facebook recently decided not to use Cassandra in future development and has adopted HBase, because its new Messaging model requires more flexible replication – which HBase provides.
The JMX Monitors window will open. At the bottom of it, you will find a link to download the JMX Agent:
The .war file should be deployed in a standard JEE servlet container. While I recommend Tomcat due to its small footprint and ease of use, there are other options. If you already use an application server such as JBoss or WebSphere, you can deploy the JMX agent in it. While Tomcat offers many ways to deploy a .war file, the easiest one is simply to copy the .war file to Tomcat's webapps folder. If Tomcat is already running, you don't need to restart it - it will pick up and deploy the new .war file automatically.
Another consideration is which machine to deploy the monitor on. The HBase master would be the natural
choice but your decision is going to be influenced by your specific network topology and corporate
standards. In any case, the JMX agent should be able to access the HBase master on TCP port 10101 and
optionally the Region Servers (data nodes) on port 10102. Additionally, Tomcat must be accessible on port
8080 (default).
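Before deploying, a quick connectivity check against those ports can save some head-scratching. The sketch below is a generic convenience script, not part of the Monitis tooling; the host names are placeholders for your own topology.

# Quick reachability check for the ports the JMX agent depends on.
# Host names are placeholders; substitute your own machines.
import socket

checks = [
    ("hbase-master.example.com", 10101),   # HBase master JMX port
    ("hbase-region1.example.com", 10102),  # optional: a Region Server
    ("localhost", 8080),                   # Tomcat hosting the JMX agent .war
]

for host, port in checks:
    try:
        with socket.create_connection((host, port), timeout=3):
            print(f"{host}:{port} reachable")
    except OSError as err:
        print(f"{host}:{port} NOT reachable ({err})")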
Enter your Monitis credentials - the same ones you use to log in to your account on monitis.com - and click Login.
Once it logs you in, the JMX Agent will prompt you to enter an Agent Name.
The Agent Name is used to identify the JMX Agent instance uniquely within Monitis. The metadata about the
monitors is associated with your account on monitis.com and the JMX agent will automatically download and
run any existing monitors previously defined for this agent name. For this reason, you want to choose a
unique name for each JMX agent deployment. Once you enter a meaningful name and click Save, you should
see the JMX Parameters page:
Make sure you enter the correct JMX port number and credentials you configured in HBase and
click Submit to go to the next page:
Select the hadoop domain from the drop down - HBase's MBeans live there for historical reasons. (You may
also want to explore other domains - such as java.lang, which provides important information on JVM's
internals). Within the hadoop domain, select the HBase service -> RPC Statistics. The next screen shows an
impressive number of metrics:
Under Monitor Name enter something meaningful. This is how your monitor will appear on
monitis.com. Check interval is in minutes. Select the following attributes:
» getNumOps
» getAvgTime
» getMinTime
» getMaxTime
» putNumOps
» putAvgTime
» putMinTime
» putMaxTime
While most attribute names are self-explanatory, the MBean does not provide a meaningful description for
the attributes, so feel free to examine the JMX section of the HBase book. Once you have selected all the
metrics you are interested in, click on the Add Monitor button at the bottom of the page.
We are now ready to log on to Monitis and examine the data collected by our newly created monitor. If you
are just logging on, Monitis will prompt you to add the new monitor, otherwise go to Monitors -> Manage
Monitors -> JMX Monitors to open the familiar JMX Monitors screen:
That's it! As with any monitor, you can choose between multiple views and define notifications for your HBase
performance metrics:
About Monitis
Monitis is the leading provider of Cloud-based Application Performance Management & Monitoring solutions
for System Admins and Web Developers. Over 80,000 users worldwide have chosen Monitis to increase
uptime and user experience of their services and products.
Monitis’ core product offerings include website monitoring, website full page load testing, transaction
monitoring, application and database monitoring, cloud resource monitoring, and server and internal
network monitoring. What makes Monitis’ software different is how fast it is to deploy, its flexible pricing and
feature-rich technology that provides a comprehensive single-pane view of on-premise and off-premise
infrastructure and applications.
The Monitis Exchange repository is open for community contributions; the code is provided as open source and developers are
free to modify and extend it as they wish. Contributors have already provided monitoring scripts for:
» Popular SQL and NoSQL databases such as Cassandra, MongoDB, Berkeley DB and HBase
» Cloud services such as Amazon EC2 and Azure
» Web servers such as Apache, Nginx and IIS
» Microsoft products such as SharePoint, MS Exchange and universal WMI monitor
» Important open source products including memcached and Node.js
» Virtual platforms like VMware, XEN etc.
The Monitis Exchange also hosts open source SDKs for Java, PHP, Perl, Python, Ruby which use the Monitis
open API (http://monitis.com/api/api.html) to create custom monitors.
Content by Monitis is licensed under a Creative Commons Attribution 3.0 Unported License [http://creativecommons.org/licenses/by/3.0/].