NoSQL Technologies: Performance Characteristics and Monitoring
Choose between NoSQL and an RDBMS depending on which of these statements describe your workload:

NoSQL
» Storage should be able to deal with very high load
» You do many write operations on the storage
» You want storage that is horizontally scalable
» Simplicity is good, as in a very simple query language (without joins)

RDBMS
» Storage is expected to be high-load, too, but it mainly consists of read operations
» You want performance over a more sophisticated data structure
» You need a powerful SQL query language
So what is wrong with relational databases? Fundamentally, nothing, but they do have limitations that constrain what can be achieved in certain scenarios. Consider these three problems with RDBMSs:
» RDBMSs use a table-based normalization approach to data, which can be a limited model. Certain
data structures cannot be represented without altering the data, programs, or both.
» They only allow update-in-place activities: Create, Read, Update and Delete. Arguably, updates should never be allowed, because they destroy information. For at least some applications it is better that, when data changes, the database simply adds another record and notes the previous value for that record.
» Performance falls off as RDBMSs normalize data. The reason: normalization requires more tables, table joins, keys and indexes, and thus more internal database operations to parse and execute queries. Pretty soon the database grows into the terabytes, and with the overhead of the RDBMS model that is when things slow down.
NoSQL databases generally fall into four categories:
1. Key-Value Stores. The underlying principle here is the use of a hash table, in which there is a unique key and a pointer to a particular item of data. The key-value model is the simplest and easiest to implement, but it is inefficient when you are only interested in querying or updating part of a value, among other disadvantages. (A short sketch contrasting these models follows the list.)
2. Column Family Stores. These were created to store and process very large amounts of data
distributed over many machines. There are still keys but they point to multiple columns, with
columns being arranged by column family.
3. Document Databases. These were inspired by Lotus Notes and are similar to key-value stores. The model is basically versioned documents that are collections of other key-value collections. The semi-structured documents are stored in formats like JSON. Document databases are essentially the next level of key-value stores, allowing nested values to be associated with each key, and they support more efficient querying.
4. Graph Databases. Instead of tables of rows and columns and the rigid structure of SQL, a flexible
graph model is used which, again, can scale across multiple machines.
Strengths: Graph algorithms e.g. shortest path, connectedness, n degree relationships, etc.
Weaknesses: Has to traverse the entire graph to achieve a definitive answer. Not easy to cluster.
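To make the difference between the first three models concrete, here is a minimal sketch in Python that uses plain dictionaries as stand-ins for the storage engines; the record contents and key names are invented purely for illustration.

# Illustrative only: plain Python dictionaries standing in for the storage engines.

# Key-value store: one opaque value per key. To change the city you must
# rewrite (or at least reparse) the whole value.
kv_store = {
    "user:1001": '{"name": "Ada", "city": "London", "tags": ["admin"]}',
}

# Document store: the value is a nested, semi-structured document (JSON-like),
# so the database itself can query or update individual fields.
doc_store = {
    "user:1001": {
        "name": "Ada",
        "city": "London",
        "tags": ["admin"],
    },
}

# Column-family layout: a row key points to named columns grouped by family.
column_store = {
    "user:1001": {                 # row key
        "profile": {               # column family
            "name": "Ada",
            "city": "London",
        },
        "activity": {
            "last_login": "2011-06-01",
        },
    },
}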
» Logging/Archiving. Log-mining tools are handy because they can access logs across servers, relate
them and analyze them.
» Social Computing Insight. Many enterprises today have provided their users with the ability to do
social computing through message forums, blogs etc.
» External Data Feed Integration. Many companies need to integrate data coming from business
partners. Even if the two parties conduct numerous discussions and negotiations, enterprises have
little control over the format of the data coming to them. Also, there are many situations where those
formats change very frequently – based on the changes in the business needs of partners.
» Enterprise Content Management Service. Content Management is now used across companies’
different functional groups, for instance, HR or Sales. The challenge is bringing together different
groups using different meta data structures in a common content management service.
» Real-time stats/analytics. Sometimes it is necessary to use the database as a way to track real-time
performance metrics for websites (page views, unique visits, etc.) Tools like Google Analytics are
great but not real-time — sometimes it is useful to build a secondary system that provides basic real-
time stats. Other alternatives, such as 24/7 monitoring of web traffic, are a good way to go, too.
Apache Cassandra
Apache Cassandra - like so many other NoSQL tools – is an open-source distributed database system. It was
originally created at Facebook in 2008 – but with much input from other sources, notably Google (BigTable)
and Amazon (Dynamo).
Cassandra has a very flexible schema. As originally described in the Google “BigTable” paper, Cassandra offers
the organization of a traditional RDBMS table layout combined with the flexibility and power of no stringent
structure requirements. This allows you to store your data as you need to – without a performance penalty for
changes, which can be important as your storage needs evolve over time.
Cassandra is based on a key-value model. A database consists of column families, and a column family is a set of key-value pairs. Drawing an analogy with relational databases, you can think of a column family as a table and a key-value pair as a record in a table. A table in Cassandra is a distributed multi-dimensional map indexed by a key. Cassandra can handle maps with four or five dimensions: keyspace, column family, row key and column name, with an optional super column adding the fifth dimension.
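As a rough mental model (not Cassandra's actual API), the multi-dimensional map can be pictured as nested dictionaries; the keyspace, column family and row key names below are made up for the example.

# A mental model of Cassandra's distributed multi-dimensional map,
# sketched as nested Python dicts. All names are illustrative only.
cassandra_data = {
    "AppKeyspace": {                       # keyspace
        "Users": {                         # column family ("table")
            "user:1001": {                 # row key
                "name": "Ada",             # column name -> value
                "email": "ada@example.com",
            },
        },
        "UserEvents": {                    # a super-column family adds a
            "user:1001": {                 # fifth dimension:
                "2011-06-01": {            # super column
                    "event": "login",
                },
            },
        },
    },
}

# Reading a value means walking the dimensions in order:
value = cassandra_data["AppKeyspace"]["Users"]["user:1001"]["name"]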
Cassandra has the ability to scale its read/write throughput linearly with machine count based on its unique
architecture, a very compelling benefit when data volumes are expected to grow rapidly. A collection of
Cassandra nodes together is called a ring and in a ring each Cassandra node does exactly the same thing. To
make all this scaling happen seamlessly, Cassandra nodes employ a gossip protocol which enables the nodes
to talk to each other and figure out where in the ring to send reads and writes.
Like many highly scalable database systems, Cassandra uses data partitioning to achieve some of its incredible
speed characteristics. Cassandra’s default behavior is to partition data randomly across the ring. Incredibly,
even with your data partitioned, you can add and remove nodes from your Cassandra ring without
compromising availability or consistency!
So with a multitude of nodes in your Cassandra ring how do you monitor this massive system? Do you even
need to monitor it? Can it be monitored? Yes, yes, and yes.
Cassandra comes with a very nifty command line tool called nodetool. Nodetool can help you monitor your
ring extremely well by plugging into JMX, the Java Management Extensions. This enables you to gather very
detailed information about your ring. You can get details about individual column families or about the entire
ring itself!
bin/nodetool -host 10.176.0.146 ring
Address Status Load Range Ring
10.176.0.146 Up 459.27 MB 75603446264197340449435394672681112420 |<--|
10.176.1.161 Up 382.53 MB 137462771597874153173150284137310597304 | |
10.176.1.162 Up 511.34 MB 63538518574533451921556363897953848387 |-->|
This provides the basis that can be used to monitor Cassandra but it needs a supporting framework to make it
useful as a production tool. Fortunately, we have done a little bit of the work for you and created an open
source Monitis-Cassandra project that can help you monitor your Cassandra clusters in style. This captures the
data from the nodetool output and uploads it to Monitis’ cloud-based platform, where it is parsed and stored for
later display and analysis. Set the Monitis-Cassandra tool to run using cron or any other scheduling agent and
you have yourself a fully functioning monitoring system for your Cassandra cluster! Check back in to the
Monitis Web site to see all your fresh Cassandra metrics rolling in.
Using this combination of tools you can see how many nodes are currently running and how many are
currently active. This is important, and you might want to set up a Monitis alert to let you know whenever
one of your nodes goes down. (Keep in mind that Cassandra should still be running fine with a single node
failure, but it is definitely important to stay in the know.)
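As a rough illustration of the kind of work the Monitis-Cassandra project automates, the sketch below parses nodetool ring output of the form shown earlier and counts the nodes reporting Up; the nodetool path, host address and alerting rule are assumptions for the example, not part of the project itself.

# Rough sketch: count Up/Down nodes from "nodetool ring" output so the result
# can be pushed to Monitis as a custom monitor. Paths and thresholds are
# illustrative assumptions, not taken from the Monitis-Cassandra project.
import subprocess

def ring_status(host="10.176.0.146", nodetool="bin/nodetool"):
    output = subprocess.check_output([nodetool, "-host", host, "ring"], text=True)
    up, down = 0, 0
    for line in output.splitlines():
        fields = line.split()
        # Data lines look like: "<address> Up 459.27 MB <token> ..."
        if len(fields) >= 2 and fields[1] in ("Up", "Down"):
            if fields[1] == "Up":
                up += 1
            else:
                down += 1
    return up, down

if __name__ == "__main__":
    up, down = ring_status()
    print(f"nodes up: {up}, nodes down: {down}")
    # An alert in Monitis could fire whenever down > 0.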
Since Cassandra is a Java application, you can also get information about the size of the heap space. You
might want to set an alert to go off if your Cassandra node is running out of memory because once it does it
will start paging to disk, which will impact performance dramatically and lose some of the benefit associated
with NoSQL in the first place.
Finally, there are some really fine-grained per column family metrics that come in through this open source
tool. You can get read and write latency for each of your column families – this is great! If your read latency is
moving above 200ms, then set an alert in Monitis to stay on top of the problem. You can set alerts on the
write latency as well. Fine-grained monitoring of individual Column Families is a great way to make sure that
your cluster is operating at peak performance.
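For per-column-family figures, nodetool also offers a cfstats command whose output includes Read Latency and Write Latency lines for each column family. The sketch below shows how such output could be reduced to numbers suitable for an alert; the exact output format varies between Cassandra versions, so treat it as an illustration rather than the open source tool's actual implementation.

# Sketch: extract per-column-family read/write latencies from "nodetool cfstats"
# output. Adjust the patterns to what your nodetool version actually prints.
import re
import subprocess

def cf_latencies(host="10.176.0.146", nodetool="bin/nodetool"):
    output = subprocess.check_output([nodetool, "-host", host, "cfstats"], text=True)
    latencies, current_cf = {}, None
    for line in output.splitlines():
        cf = re.search(r"Column Family:\s*(\S+)", line)
        if cf:
            current_cf = cf.group(1)
            latencies[current_cf] = {}
        lat = re.search(r"(Read|Write) Latency:\s*([\d.]+)\s*ms", line)
        if lat and current_cf:
            latencies[current_cf][lat.group(1).lower()] = float(lat.group(2))
    return latencies

# Example: flag any column family whose read latency drifts above 200 ms.
for cf, lat in cf_latencies().items():
    if lat.get("read", 0) > 200:
        print(f"WARNING: {cf} read latency {lat['read']} ms")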
MongoDB
MongoDB combines the best features of key-value stores, document databases, object databases, and
relational database management systems (RDBMS). What that means is that MongoDB shards automatically
(as with a key-value store), allows JSON-based dynamic schema documents and offers a rich query language
in the manner of an RDBMS. MongoDB also offers a MapReduce implementation feature.
When you use Mongo, you need to understand that it is extremely fast in its default configuration, but that this comes at a price: out of the box, MongoDB is not very concerned if you lose data, if replication fails, or if your mongo instance crashes and loses an hour or so worth of data. This can be overcome by altering configurations, but it would be a major cause for concern if you are storing your users’ data and they expect their comments, financial reports or medical records to be in your system.
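As one hedged example of "altering configurations", recent drivers let you trade some of that speed back for durability by requesting acknowledged, journaled writes. The sketch assumes pymongo and a mongod running with journaling enabled; database and collection names are placeholders.

# Example of tightening MongoDB's durability guarantees from the client side.
# Exact options differ between driver and server versions.
from pymongo import MongoClient

# w=1 waits for the primary to acknowledge the write; j=True additionally
# waits for the write to reach the on-disk journal.
client = MongoClient("mongodb://localhost:27017", w=1, j=True)

client.mydb.comments.insert_one({"user": "ada", "text": "hello"})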
Monitoring MongoDB is very similar to monitoring a normal relational database. On any given machine you need to monitor things like how much CPU it is using, how much RAM, how fast the disks are, etc. If Mongo starts running out of any of your core compute resources then there will inevitably be problems!
Mongo was built to scale. The current standard installation of Mongo supports Master/Slave Replication,
Replica Sets, and Auto-Sharding right out of the box. You can easily get into complex monitoring situations
when you need to monitor not just one database but an entire cluster.
Basically, we have two sets of statistics we would like to collect from our Mongo instances. Firstly there are
the basic computer stats that you need to collect for any machine in your fleet and secondly we want to
collect all these great DB stats from the Mongo HTTP Console and have a place to store and view them. That’s
where Monitis comes into play!
To get started download the Monitis Mongo code onto your mongo server:
Now you can set up the custom Mongo monitors. This will set up the following custom monitors in Monitis:
Now that you have your monitors set up, you can send data by running:
python send_data.py
Schedule the code to run under cron or any other scheduling agent and you’re done. Pretty simple, and now you can watch all your monitoring data stream in.
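In the same spirit as send_data.py, here is a minimal sketch of pulling a few server-level statistics directly from mongod; the fields shown are standard serverStatus values, but the selection, host and database names are just an example, not the contents of the Monitis Mongo code.

# Minimal sketch of gathering MongoDB statistics for a custom monitor.
# Assumes pymongo and a mongod on the default port.
from pymongo import MongoClient

client = MongoClient("localhost", 27017)
status = client.admin.command("serverStatus")

metrics = {
    "connections_current": status["connections"]["current"],
    "opcounters_insert": status["opcounters"]["insert"],
    "opcounters_query": status["opcounters"]["query"],
    "mem_resident_mb": status["mem"]["resident"],
}

# send_data.py would push values like these to Monitis via its API.
print(metrics)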
BerkeleyDB
BerkeleyDB – the grand-daddy of NoSQL databases – started out as a project at UC Berkeley aimed at providing a simple but powerful database management system for BSD Unix. The project was so successful that it was soon spun off to create Sleepycat Software, which was acquired by Oracle in 2006.
BerkeleyDB supports transactions through a mechanism called write-ahead logging. In essence, this means
that changes are written to a log file before the database file is modified. At each checkpoint, the pending
transactions are “flushed” to the database file; if the write fails for some reason, the transaction is rolled back.
Nothing comes free, though – in a multi-process or multi-threaded transactional application, the developer has the responsibility to issue checkpoints periodically, either by using a dedicated thread or by running the db_checkpoint utility.
When it comes to performance, BerkeleyDB’s cache size is the single most important configuration parameter.
In order to achieve maximum performance, the data most frequently accessed by your application should be
cached so that read and write requests do not trigger too much I/O activity.
BerkeleyDB databases are grouped together in application environments - directories which group data and
log files along with common settings used by a particular application. The database cache is global for all
databases in a particular environment and needs to be allocated at the time that environment is created.
Most of the time this is done programmatically. The issue, of course, is that the optimal cache size is directly related to the size of the data set, which is not always known at application design time. Many BerkeleyDB
applications end up storing considerably more data than originally envisioned. Even worse, some applications
do not explicitly specify a cache size at all, so the default cache size of 256kB is allocated – which is far too low
for many applications. Such applications suffer from degraded performance as they accumulate more data;
their performance can be significantly improved by increasing the cache size. Luckily, most of the time this
can be achieved without any code changes by creating a configuration file in your BerkeleyDB application
environment.
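For example, BerkeleyDB reads a file named DB_CONFIG from the environment home directory when the environment is opened, and a set_cachesize line there overrides the value compiled into the application. The 512 MB figure below is only an example; size the cache to your own working set.

# DB_CONFIG - placed in the BerkeleyDB environment home directory.
# set_cachesize <gbytes> <bytes> <ncache>: here 0 GB + 512 MB in a single region.
set_cachesize 0 536870912 1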
The db_stat tool can help you determine the size of the BerkeleyDB cache and the cache hit rate (the number
of pages retrieved from the cache as opposed to loaded from disk). Run the command in the directory
containing the BerkeleyDB environment (or use the -h switch to specify a directory):
$ db_stat -m
264KB 48B Total cache size
...
100004 Requested pages found in the cache (42%)
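To give a feel for what the monitor extracts, here is a rough Python equivalent that pulls the cache hit rate out of db_stat -m output; the environment path is a placeholder, and monitor_bdb.pl remains the supported way to feed this into Monitis.

# Rough sketch: derive the cache hit rate from "db_stat -m" output, similar in
# spirit to what monitor_bdb.pl does in Perl. The environment path is an assumption.
import re
import subprocess

def cache_hit_rate(env_dir="/var/lib/mybdbenv"):
    output = subprocess.check_output(["db_stat", "-m", "-h", env_dir], text=True)
    match = re.search(r"Requested pages found in the cache \((\d+)%\)", output)
    return int(match.group(1)) if match else None

print("cache hit rate:", cache_hit_rate(), "%")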
Let’s see how Monitis can help you keep tabs on the cache hit rate and other critical database metrics. We use a custom monitor created by a Perl script, monitor_bdb.pl, which can be downloaded from our repository on GitHub. To familiarize yourself with its parameters, run ‘monitor_bdb.pl --help‘. As always, we encourage you to look at the source code to understand its inner workings.
Once the monitor window opens, switch to the Line Chart view and select Cache Hit Rate from the selection
box on the left to display the graph:
The notification rule will be saved and Monitis will notify you if the cache hit rate falls below your chosen
threshold.
HBase
HBase is really a clone (or a very close relative) of Google’s Bigtable and was originally created for use with Hadoop. In fact, HBase is a subproject of the Apache Hadoop project. HBase offers database capabilities for Hadoop, which means you can use it as a source or sink for MapReduce jobs. HBase is a column-oriented database, and it is built to provide low-latency requests on top of Hadoop HDFS. Unlike some other columnar databases that provide eventual consistency, HBase is very consistent.
HBase supports unstructured and partially structured data. To do so, data is organized into column families.
You address an individual record, called a “cell” in HBase, with a combination of row key, column family, cell
qualifier, and time stamp. As opposed to RDBMS (relational database management systems), in which you
must define your table well in advance, with HBase you can simply name a column family and then allow the
cell qualifiers to be determined at runtime. This lets you be very flexible and supports an agile approach to
development.
Clustering in HBase is less transparent than in some other NoSQL tools and uses several kinds of servers. HDFS
needs at least one namenode and several datanodes, plus, HBase needs a ZooKeeper cluster, a master and
several region servers. Requests must be made to the master(s). On the HDFS level, existing data are not
sharded automatically. However, new data is sharded. On the HBase level, data is divided into regions that are
sharded automatically across region servers.
If you choose, HBase will allow you to use Google’s Protobuf (Protocol Buffer) API as an alternative to XML. Protobuf is a very efficient way of serializing data: it has a noticeable advantage in compactness (the same data is typically two to three times smaller than XML) and is 20 to 100 times faster to parse.
Interestingly, Facebook recently decided not to use Cassandra in future development and has adopted HBase, because its new Messaging model requires more flexible replication – which HBase provides.
The JMX Monitors window will open. At the bottom of it, you will find a link to download the JMX Agent:
The .war file should be deployed in a standard JEE servlet container. While I recommend Tomcat due to its small footprint and ease of use, there are other options. If you already use an application server such as JBoss or WebSphere, you can deploy the JMX agent in it. While Tomcat offers many ways to deploy a .war file, the easiest one is simply to copy the .war file to Tomcat's webapps folder. If Tomcat is already running, you don't need to restart it - it will pick up and deploy the new .war file automatically.
Another consideration is which machine to deploy the monitor on. The HBase master would be the natural
choice but your decision is going to be influenced by your specific network topology and corporate
standards. In any case, the JMX agent should be able to access the HBase master on TCP port 10101 and
optionally the Region Servers (data nodes) on port 10102. Additionally, Tomcat must be accessible on port
8080 (default).
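Before deploying, a quick connectivity check against those ports can save some head-scratching. The sketch below is a generic convenience script, not part of the Monitis tooling; the host names are placeholders for your own topology.

# Quick reachability check for the ports the JMX agent depends on.
# Host names are placeholders; substitute your own machines.
import socket

checks = [
    ("hbase-master.example.com", 10101),   # HBase master JMX port
    ("hbase-region1.example.com", 10102),  # optional: a Region Server
    ("localhost", 8080),                   # Tomcat hosting the JMX agent .war
]

for host, port in checks:
    try:
        with socket.create_connection((host, port), timeout=3):
            print(f"{host}:{port} reachable")
    except OSError as err:
        print(f"{host}:{port} NOT reachable ({err})")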
Enter your Monitis credentials - the same ones you use to log in to your account on monitis.com - and click Login.
Once it logs you in, the JMX Agent will prompt you to enter an Agent Name.
The Agent Name is used to identify the JMX Agent instance uniquely within Monitis. The metadata about the
monitors is associated with your account on monitis.com and the JMX agent will automatically download and
run any existing monitors previously defined for this agent name. For this reason, you want to choose a
unique name for each JMX agent deployment. Once you enter a meaningful name and click Save, you should
see the JMX Parameters page:
Make sure you enter the correct JMX port number and credentials you configured in HBase and
click Submit to go to the next page:
Select the hadoop domain from the drop down - HBase's MBeans live there for historical reasons. (You may
also want to explore other domains - such as java.lang, which provides important information on JVM's
internals). Within the hadoop domain, select the HBase service -> RPC Statistics. The next screen shows an
impressive number of metrics:
Under Monitor Name enter something meaningful. This is how your monitor will appear on
monitis.com. Check interval is in minutes. Select the following attributes:
» getNumOps
» getAvgTime
» getMinTime
» getMaxTime
» putNumOps
» putAvgTime
» putMinTime
» putMaxTime
While most attribute names are self-explanatory, the MBean does not provide a meaningful description for
the attributes, so feel free to examine the JMX section of the HBase book. Once you have selected all the
metrics you are interested in, click on the Add Monitor button at the bottom of the page.
We are now ready to log on to Monitis and examine the data collected by our newly created monitor. If you
are just logging on, Monitis will prompt you to add the new monitor, otherwise go to Monitors -> Manage
Monitors -> JMX Monitors to open the familiar JMX Monitors screen:
That's it! As with any monitor, you can choose between multiple views and define notifications for your HBase
performance metrics:
About Monitis
Monitis is the leading provider of Cloud-based Application Performance Management & Monitoring solutions
for System Admins and Web Developers. Over 80,000 users worldwide have chosen Monitis to increase
uptime and user experience of their services and products.
Monitis’ core product offerings include website monitoring, website full page load testing, transaction
monitoring, application and database monitoring, cloud resource monitoring, and server and internal
network monitoring. What makes Monitis’ software different is how fast it is to deploy, its flexible pricing and
feature-rich technology that provides a comprehensive single-pane view of on-premise and off-premise
infrastructure and applications.
The Monitis Exchange repository is open for community contributions; the code is provided as open source and developers are
free to modify and extend it as they wish. Contributors have already provided monitoring scripts for:
» Popular SQL and NoSQL databases such as Cassandra, MongoDB, Berkeley DB and HBase
» Cloud services such as Amazon EC2 and Azure
» Web servers such as Apache, Nginx and IIS
» Microsoft products such as SharePoint, MS Exchange and universal WMI monitor
» Important open source products including memcached and Node.js
» Virtual platforms like VMware, XEN etc.
The Monitis Exchange also hosts open source SDKs for Java, PHP, Perl, Python, Ruby which use the Monitis
open API (http://monitis.com/api/api.html) to create custom monitors.
Content by Monitis is licensed under a Creative Commons Attribution 3.0 Unported License [http://creativecommons.org/licenses/by/3.0/].