Considerations For Using NoSQL Technology On Your Next IT Project

Originally presented at: British Computer Society (BCS) SPA-270, London, UK, 6 February 2013 http://www.bcs-spa.org/cgi-bin/view/SPA/NoSqlDatabasesForBigData

Uploaded by

Akmal Chaudhri
Considerations for using {"no":"SQL"} technology on your next IT project
Akmal B. Chaudhri
(艾克摩 曹理)
Download the PDF file

• This presentation contains high-resolution graphics


• Download the PDF file for the best viewing experience
• Slides last updated on 2 August 2015

Source: Shutterstock Image ID 112849948


Abstract
Over the past few years, we have seen the emergence
and growth of NoSQL technology. This has attracted
interest from organizations looking to solve new business
problems. There are also examples of how this
technology has brought practical and commercial
benefits to some organizations. However,
since it is still an emerging technology, careful
consideration is required in finding the relevant
developer skills and choosing the right product. This
presentation will discuss these issues in greater detail. In
particular, it will focus on some of the leading NoSQL
products and discuss their architectures and suitability
for different problems.
Why it’s important
Half of the “NoSQL” databases and “big
data” technologies that are hot buzzwords
won’t be around in 15 years.
-- Michael O. Church

Source: http://lifehacker.com/what-i-wish-i-knew-when-i-started-my-career-as-a-softwa-1681002791/
Agenda
In a packed program ...
• Introduction
• Market analysis
• NoSQL
• Security and vulnerability
• Polyglot persistence
• Benchmarks and performance
• BI/Analytics
• NoSQL alternatives
• Summary
• Resources
Introduction
My background
• ~25 years experience in IT
– Developer (Reuters)
– Academic (City University)
– Consultant (Logica)
– Technical Architect (CA)
– Senior Architect (Informix)
– Senior IT Specialist (IBM)
– TI (Hortonworks)
– SA (DataStax)
• Worked with various technologies
– Programming languages
– IDE
– Database Systems
• Client-facing roles
– Developers
– Senior executives
– Journalists
• Broad industry experience
• Community outreach
• University relations
• 10 books, many presentations
History
Have you run into limitations with
traditional relational databases? Don’t
mind trading a query language for
scalability? Or perhaps you just like shiny
new things to try out? Either way this
meetup is for you.
Join us in figuring out why these new
fangled Dynamo clones and BigTables
have become so popular lately.

Source: http://nosql.eventbrite.com/
Your path leads to NoSQL?

SQL

SQL
SQL

Source: Shutterstock Image ID 159183185


Source: Shutterstock Image ID 99862922
Gartner hype curve

NoSQL
Magic quadrant

hot

SQL

lame
DB

ugly cool

Source: After “say No! No! and No! (=NoSQL Parody)” Jens Dittrich (2013)
Magic quadrant 2013
Challengers Leaders

IBM,  
EnterpriseDB,  
Microsoft,
InterSystems  
Oracle,  SAP  

Others   Aerospike  

Niche players Visionaries

Source: “Magic Quadrant for Operational Database Management Systems” Gartner (21 October 2013)
Magic quadrant 2014
Challengers Leaders

IBM, Microsoft,
Oracle,  SAP  
MongoDB   EnterpriseDB,  
InterSystems,  
MariaDB,  
MarkLogic  

Aerospike,  
Others   Couchbase,  
DataStax  

Niche players Visionaries

Source: “Magic Quadrant for Operational Database Management Systems” Gartner (16 October 2014)
Magic quadrant for dummies

Source: Oliver Widder, used with permission


Innovation adoption lifecycle

Source: http://en.wikipedia.org/wiki/Technology_adoption_lifecycle
Crossing the chasm

Chasm
1990s

[Chart: OO Databases Predicted Growth, US$ Million (axis 0 to 1800), 1996 to 2000]
2000s

[Chart: XML Databases Predicted Growth, US$ Million (axis 0 to 800), 1999 to 2004]
Today

[Chart: NoSQL Databases Predicted Growth, US$ Million (axis 0 to 1200), 2012 to 2016]
The way developers really think

XML
NoSQL

OO
OO vs. Relational

Source: Inspired by comments from Esther Dyson during the 1990s


XML vs. Relational

Source: Inspired by “Tamino -- What is it good for?” Curtis Pew (2003)


NoSQL vs. Relational

Source: Inspired by http://www.slideshare.net/mongodb/webinar-the-opex-business-plan-for-nosql/ and
http://www.slideshare.net/lj101197/couchbase-overview033113long/
But ...
Relational flexibility

Source: Shutterstock Image ID 73381360


Welcome to 1985

[Diagram: two stacks side by side, an Application on a Relational database system and an Application on a NoSQL database system]

Source: After “NoSQL and the responsibility shift” Denshade (14 March 2015)
“MongoDB is web scale”
It may surprise you that there are a
handful of high-profile websites still using
relational databases and in particular
MySQL.

Source: http://mongodb-is-web-scale.com [WARNING: strong language]


NoSQL is developer-friendly
Other Stakeholders

Developers
But ...
Riak ... We’re talking about nearly a year
of learning.[1]

Things I wish I knew about MongoDB a year ago[2]

I am learning Cassandra. It is not easy.[3]

[1] http://productionscale.com/blog/2011/11/20/building-an-application-upon-riak-part-1.html
[2] http://snmaynard.com/2012/10/17/things-i-wish-i-knew-about-mongodb-a-year-ago/
[3] http://planetcassandra.org/blog/post/datastax-java-driver-for-apache-cassandra
And ...
... it takes 1-3 years to get an enterprise
application onto a new data platform like
Cassandra ... Cassandra requires a
complete re-thinking of the data model
which many find challenging.
-- Shanti Subramanyam

Source: http://www.orzota.com/cassandra-summit-2013/
And ...
Going from being a company where most
people spent their entire careers using
relational databases ... to NoSQL
structure, we then ended up creating
problems for ourselves ... So with
hindsight I would have thought more about
the organisational preparedness.
-- Keith Pritchard

Source: “JPMorgan consolidates derivative trade systems with NoSQL database” Matthew Finnegan
(12 March 2015)
Moving corporate data ...
200 ft.

100 ft.

9 miles

Source: Shutterstock Image ID 163030709


Moving corporate data
• Moving water from one big tank to another
without losing a single drop
– Reading from Relational and writing to NoSQL
• The amount of information currently stored in
NoSQL databases would not quench a thirst on
a hot day
• Dante has reserved a special place in hell for
NoSQL database vendors
– Moving water from one big tank into another using
just a small spoon between their teeth

Source: Adapted from “COM and DCOM” Roger Sessions (1997)
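
The tank-to-tank analogy above can be sketched as a migration loop that checks nothing was lost. This is only a minimal illustration, not a method from the presentation: the `customer` table, its columns, and the plain Python dict standing in for a NoSQL store are all assumptions.

```python
import sqlite3

# Hypothetical relational source: an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ann", "London"), (2, "Bob", "Leeds"), (3, "Cy", "York")],
)
conn.commit()


def migrate(conn, table, key_column):
    """Copy every row of `table` into a dict of documents keyed by `key_column`."""
    # `table` is interpolated into SQL, so it must come from trusted code.
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    store = {}  # stand-in for a NoSQL document store (assumption)
    for row in cursor:
        document = dict(zip(columns, row))
        store[document[key_column]] = document
    # "Without losing a single drop": compare source and target counts.
    source_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if len(store) != source_count:
        raise RuntimeError(f"lost rows: {source_count} read, {len(store)} written")
    return store


documents = migrate(conn, "customer", "id")
print(len(documents), documents[2])
```

A count comparison is the weakest possible check; a real migration would probably also compare checksums or sample records, which this sketch leaves out.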


But ...
• Riak at the National Health Service (UK)
– New DBMS needs 10-12 people to manage it,
compared to over 100 for the old systems
– Cost of infrastructure supporting new DBMS reduced
to ~5% of the old systems
– Lookup times for patient records significantly reduced
from seconds to milliseconds

Source: “Time to Take Another Look at NoSQL” Philip Carnelley (3 October 2014)
NoSQL hoopla and hype

Source: Getty Image ID WCO_030


Source: Shutterstock Image ID 92042489
Source: Inspired by “The Next Big Thing 2012” The Wall Street Journal (27 September 2012)
Source: Inspired by “NoSQL takes the database market by storm” Brandon Butler (27 October 2014)
Source: Inspired by http://www.marketresearchmedia.com/?p=568 and
http://www.pr.com/press-release/613495
Source: Inspired by http://dilbert.com/strips/comic/1995-01-22/
Source: Inspired by http://vimeo.com/104045795/
Source: Inspired by https://www.youtube.com/watch?v=3MNIrKlQp2E
Source: Inspired by “MongoDB: Second Round” Thomas Jaspers (8 November 2012)
Source: Inspired by “Why MongoDB is Awesome” John Nunemaker (15 May 2010) and
“Why Neo4J is awesome in 5 slides” Florent Biville (29 October 2012)
Source: Inspired by http://slv.io/
Source: Inspired by “Saturday Night Live” Season 1 Episode 9 (1976)
Source: Inspired by the movie “Airplane!” (1980)
Past proclamations of the imminent
demise of relational technology
• Object databases vs. relational
– GemStone, ObjectStore, Objectivity, etc.
• In-memory databases vs. relational
– SolidDB, TimesTen, etc.
• Persistence frameworks vs. relational
– Hibernate, OpenJPA, etc.
• XML databases vs. relational
– BaseX, Tamino, etc.
• Column-store databases vs. relational
– Sybase IQ, Vertica, etc.
Market analysis
Database market size ...

[Bar chart, US$ Billion (axis 0 to 35): NoSQL 0, Relational 30]

Source: “2014 State of Database Technology” InformationWeek (March 2014)


Database market size
NoSQL is a small but growing segment of
the database market, according to 451
Research’s Matt Aslett, who predicts it at
about 2% of the size of the SQL market.
-- Brandon Butler

Source: “NoSQL takes the database market by storm” Brandon Butler (27 October 2014)
NoSQL market size
• Private companies do not publish results
• Venture Capital (VC) funding runs to tens or hundreds of millions of US$
• NoSQL revenue
– $20 million in 2011[1]
– $184 million in 2012[2]
– $223 million in 2014[3]

[1] http://blogs.the451group.com/information_management/2012/05/
[2] http://www.cio.co.uk/insight/data-management/new-database-dawn/
[3] http://www.datanami.com/2015/04/02/booming-big-data-market-headed-for-60b/
NoSQL vendor revenue 2012

[Bar chart, US$ Million (axis 0 to 40), vendors in chart order: 10gen, DataStax, Basho, Couchbase, Aerospike, Neo Technologies]

Source: http://wikibon.org/wiki/v/Big_Data_Vendor_Revenue_and_Market_Forecast_2012-2017
Source: http://twitpic.com/dzbq8b/
Source: http://www.odbms.org/2015/07/nosql-by-the-numbers/
2014 revenue vs. funding

[Bar chart, US$ Million (axis 0 to 1000): bars for Revenue and Funding; data labels 945 and 514]

Source: http://www.odbms.org/2015/07/nosql-by-the-numbers/
Investment in NoSQL, NewSQL
Company $ (Million)
MongoDB 231
Couchbase 116
DataStax 83.7
Clustrix 59.3
Basho 32.5
FoundationDB 22.3
Aerospike 22

Source: http://swtrends.wordpress.com/2014/08/22/the-nosqlnow-conference-in-san-jose-this-week/
Recent investment in NoSQL
Company $ (Million)
MongoDB 311[1]
DataStax 189.7[1]
MarkLogic 173[2]
Couchbase 116
Basho 64[3]
Neo4j 44.1[4]
Redis Labs 28[5]

[1] http://venturebeat.com/2015/01/12/basho-funding/
[2] http://fortune.com/2015/05/12/marklogic-snags-102-million/
[3] http://www.idgconnect.com/abstract/9332/basho-enterprise-focus-winning-friends-funds/
[4] http://fortune.com/2015/02/03/datastax-acquisition-database-software/
[5] http://www.informationweek.com/big-data/big-data-analytics/redis-emerges-as-nosql-in-memory-performer-/d/d-id/1321047
Vendor revenue example ...
The new funding, which values MongoDB
at $1.6 billion ... Wikibon estimates
MongoDB’s 2014 revenue at $46 million,
meaning the company is valued at
approximately 35-times lagging 12-month
revenue ...
-- Jeff Kelly

Source: http://premium.wikibon.com/the-challenges-of-building-a-thriving-nosql-start-up/
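
The "approximately 35-times" multiple in the quote above follows directly from the two figures it cites. A quick check, using only the numbers from the quote (the variable names are illustrative):

```python
# Figures as quoted: $1.6 billion valuation, $46 million estimated 2014 revenue.
valuation_usd = 1.6e9
revenue_2014_usd = 46e6

# Valuation expressed as a multiple of trailing revenue.
multiple = valuation_usd / revenue_2014_usd
print(f"{multiple:.1f}x")  # prints "34.8x", i.e. approximately 35 times
```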
Vendor revenue example
MongoDB ... I would say if we could get to
20 to 25 per cent of our user base then we
would have a multi-billion dollar company;
[at the moment] it’s less than five per cent
-- Dev Ittycheria

Source: http://linkis.com/www.computing.co.uk/BCGJU/
Vendor profitability example
MongoDB ... Profitability is still at least a
couple years away, Chairman and Co-
founder Dwight Merriman told me in an
interview.
-- Ben Fischer

Source: http://www.bizjournals.com/sanjose/news/2014/06/25/mongodb-plays-long-game-in-big-ata.html
Number of customers
Company Customers
MongoDB 2500
DataStax 500
MarkLogic 500
Couchbase 450
Basho 200
Neo4j 150

Source: http://www.odbms.org/2015/07/nosql-by-the-numbers/
NoSQL job trends ...

Source: http://java.dzone.com/articles/nosql-job-trends-august-2014 (August 2014)



Percentage increase in job posting
for key Big Data skills in US
[Stacked bar chart, % (axis 0 to 120)]
Skill       2013   2014F
MongoDB     45     60
CouchDB     40     35
Neo4j       35     35
Cassandra   35     35
HBase       15     25

Source: http://talentneuron.com/blog/2013/12/big-data-has-your-organization-taken-the-big-leap/
Most valuable IT skills in 2013
Skill $
1. R 115,531
2. NoSQL 114,796
3. MapReduce 114,396
4. PMBok 112,382
5. Cassandra 112,382
6. Omnigraffle 111,039
7. Pig 109,561
8. SOA 108,997
9. Hadoop 108,669
10. MongoDB 107,825

Source: http://marketing.dice.com/pdf/Dice_TechSalarySurvey_2014.pdf
Most valuable IT skills in 2014
Skill $
1. PaaS 130,081
2. Cassandra 128,646
3. MapReduce 127,315
4. Cloudera 126,816
5. HBase 126,369
6. Pig 124,563
7. ABAP 124,262
8. Chef 123,458
9. Flume 123,186
10. Hadoop 121,313

Source: http://marketing.dice.com/pdf/Dice_TechSalarySurvey_2015.pdf
Fastest growing tech skills

[Bar chart, % (axis 0 to 100), skills in chart order: Puppet, Cybersecurity, Big Data, NoSQL, Salesforce, Hadoop, JIRA, Cloud, Information Security, Python]

Source: http://news.dice.com/2014/09/15/fastest-growing-tech-skills-dice-report/
NoSQL jobs in the UK (perm)
• Database and
Business Intelligence
– MongoDB (1644)
– Cassandra (720)
– Redis (351)
– CouchDB (181)
– Couchbase (166)
– HBase (131)
– RavenDB (120)
– Neo4j (86)

Source: http://www.itjobswatch.co.uk/jobs/uk/nosql.do (21 June 2015)


NoSQL jobs in the UK (contract)
• Database and
Business Intelligence
– MongoDB (549)
– Cassandra (203)
– Redis (102)
– CouchDB (71)
– Couchbase (68)
– HBase (30)
– RavenDB (15)
– MarkLogic (12)

Source: http://www.itjobswatch.co.uk/jobs/uk/nosql.do (21 June 2015)


NoSQL LinkedIn skills index ...

Source: https://blogs.the451group.com/information_management/2015/07/07/nosql-linkedin-skills-index-june-2015/
NoSQL vs. the world ...

Source: After http://www.kchodorow.com/blog/2011/05/05/nosql-vs-the-world/ (11 October 2014)




DB-Engines ranking ...

[Pie chart: Top 8 Relational 88%, Top 8 NoSQL 12%]

Source: http://db-engines.com/en/ranking/ (21 June 2015)


DB-Engines ranking ...

[Pie chart, Top 8 Relational: Oracle 31%, MySQL 27%, MS SQL Server 24%, PostgreSQL 6%, DB2 4%, MS Access 3%, SQLite 3%, SAP AS 2%]

Source: http://db-engines.com/en/ranking/ (21 June 2015)


DB-Engines ranking

[Pie chart, Top 8 NoSQL: MongoDB 42%, Cassandra 17%, Redis 14%, HBase 9%, Memcached 5%, Neo4j 5%, CouchDB 4%, Couchbase 4%]

Source: http://db-engines.com/en/ranking/ (21 June 2015)


But ...
DB-Engines.com ... a popularity rating
based on web mentions/searches and
installation numbers are not the same
thing ...

Source: “Operationalizing the Buzz: Big Data 2013” EMA Research Report (November 2013)
Use of NoSQL products

[Pie chart; legend: Never heard of them / no interest, Investigating, In pilot, In production; slice labels: 51%, 41%, 4%, 4%]

Source: “State of Database Technology 2013” InformationWeek (April 2013)


NoSQL in enterprise apps

[Pie chart; legend: Not likely to consider, Actively / potentially considering, Currently using; slice labels: 8%, 27%, 65%]

Source: “Cloud Software: Where Next?” InformationWeek (August 2013)


NoSQL in use 2013

[Pie chart; legend: No current / planned use, Planned use, Used on a limited basis, Used extensively; slice labels: 4%, 15%, 19%, 62%]

Source: “2014 Analytics, BI, and Information Management Survey” InformationWeek (November 2013)
NoSQL in use 2014

[Pie chart; legend: No current / planned use, Used on a limited basis, Planned use, Used extensively; slice labels: 6%, 18%, 56%, 20%]

Source: “2015 Analytics & BI Survey” InformationWeek (December 2014)


Does your company currently have
plans to adopt NoSQL?

[Bar chart, % (axis 0 to 60), categories in chart order: No plans, Will deploy in 3+ years, Will deploy in 2 to 3 years, Will deploy in 1 to 2 years, Currently deploying, Already using a NoSQL]

Source: “The Real World of The Database Administrator” Elliot King (March 2015)
SQL, NoSQL or both?

[Pie chart; legend: Use only SQL, Use Both, Use only NoSQL, Use Nothing; slice labels: 4%, 4%, 39%, 53%]

Source: http://pages.zeroturnaround.com/Java-Tools-Technologies.html
Primary NoSQL technology

[Pie chart; legend: MongoDB, Apache Cassandra, Redis, Hazelcast, Neo4j, Other; slice labels: 17%, 3%, 5%, 9%, 56%, 10%]

Source: http://pages.zeroturnaround.com/Java-Tools-Technologies.html
Databases in use

[Bar chart, % (axis 0 to 80), databases in chart order: MS SQL Server, MS Access, Oracle, MySQL, DB2, PostgreSQL, FileMaker, MongoDB, Cassandra, DynamoDB, HBase, Couchbase, Riak, Neo4j]

Source: “2014 State of Database Technology” InformationWeek (March 2014)


What database(s) does your
company currently use?
[Bar chart, % (axis 0 to 60), databases in chart order: SQL Server, MySQL, Oracle, DB2, PostgreSQL, MongoDB, Hadoop, Cassandra, Riak, Couchbase]

Source: http://www.tesora.com/resources/infographic
Which databases does your
organization use?

[Bar chart, % (axis 0 to 70), databases in chart order: MySQL, Oracle, SQL Server, PostgreSQL, MongoDB]

Source: “Guide to Big Data” DZone Research (2014)


Databases used for most critical
functions
[Bar chart, % (axis 0 to 60), databases in chart order: MS SQL Server, Oracle, MySQL, DB2, MS Access, PostgreSQL, SAP Sybase ASE, Teradata, MongoDB]

Source: “2014 State of Database Technology” InformationWeek (March 2014)


What database brands do you have
running in your organization?

[Bar chart, % (axis 0 to 100), databases in chart order: MS SQL Server, Oracle, MySQL, DB2, MongoDB]

Source: “The Real World of The Database Administrator” Elliot King (March 2015)
NoSQL, NewSQL, or non-relational
data store technology adoption
[Bar chart, % (axis 0 to 50), products in chart order: MongoDB, SQLFire, Cassandra, HBase, CouchDB/Couchbase, SimpleDB, BerkleyDB, DataStax, Redis, DynamoDB, MemSQL, VoltDB, Castle, RavenDB]

Source: “2014 Data Connectivity Outlook” Progress Software (November 2013)


NoSQL or non-relational data store
technology adoption

[Bar chart, % (axis 0 to 30), products in chart order: MongoDB, SimpleDB, Cassandra, HBase, Couchbase, DynamoDB, Riak]

Source: “2015 Data Connectivity Outlook” Progress Software (April 2015)


When deploying new applications,
which database alternatives do you
evaluate?
[Bar chart, % (axis 0 to 70), databases in chart order: MS SQL Server, Oracle, SAP HANA, IBM DB2, DataStax, MongoDB, HBase]

Source: Cowen and Company Mid-Year 2015 IT Spending Survey (May 2015)
Hosting example ...

[Pie chart, DB market share (%) for 2014; legend: MySQL, MariaDB, PostgreSQL, MongoDB, CouchDB; slice labels: 61%, 16%, 12%, 10%, 1%]

Source: http://blog.jelastic.com/2015/01/13/software-stacks-market-share-2014-summary/
Hosting example

[Line chart, DB market share (%) for 2013 - 2014, by month (axis 0 to 80); series: MySQL, MariaDB, PostgreSQL, MongoDB, CouchDB]

Source: Jelastic
Top 2013 DM topics

[Pie chart; legend: Enterprise IM, NoSQL, Big Data, Data Gov, Quality, Data Modeling, BI / Analytics, Data Science, Unstructured Data, Chief Data Officer; slice labels: 24%, 17%, 16%, 15%, 12%, 10%, 3%, 2%, 1%]

Source: http://www.dataversity.net/top-20-hottest-data-management-posts-year-to-date-2014/
Top 2014 DM topics

[Pie chart; legend: Enterprise IM, BI / Analytics, NoSQL, Data Gov, Quality, Data Modeling, Big Data, Data Strategy, Data Science, Cognitive Comp; slice labels: 23%, 21%, 15%, 13%, 11%, 9%, 3%, 3%, 1%, 1%]

Source: http://www.dataversity.net/top-20-hottest-data-management-posts-year-to-date-2015/
NoSQL
Imitation is the sincerest form of
flattery - thank you Couchbase!
“The Stars, Like Dust”
... a squadron of small, flitting ships that
had struck and vanished, then struck
again, and made scrap of the lumbering
titanic ships that had opposed them ...
abandoning power alone, stressed speed
and co-operation ...
-- Isaac Asimov

Source: “The Stars, Like Dust” Isaac Asimov (1951)


NoSQL The Movie!

Sequel
History in No-tation

1970: NoSQL = We have no SQL

1980: NoSQL = Know SQL

2000: NoSQL = No SQL!

2005: NoSQL = Not only SQL

2013: NoSQL = No, SQL!

Source: “Perception is Key: Telescopes, Microscopes and Data” Mark Madsen (2013)
The meme changed

SQL Not
Only SQL
Why did NoSQL datastores arise?
• Some applications need very few database
features, but need high scale
• Desire to avoid data/schema pre-design
altogether for simple applications
• Need for a low-latency, low-overhead API to
access data
• Simplicity -- do not need fancy indexing -- just
fast lookup by primary key
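
The last two bullets describe an access pattern that can be sketched in a few lines: a store that offers nothing but put, get, and delete by primary key. This is an illustrative toy, not any vendor's API; the class name `KeyValueStore` and its methods are assumptions made for the example.

```python
from typing import Any, Dict


class KeyValueStore:
    """Toy in-memory store: no schema, no secondary indexes, no query language."""

    def __init__(self) -> None:
        self._rows: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Overwrites any existing value stored under the same primary key.
        self._rows[key] = value

    def get(self, key: str, default: Any = None) -> Any:
        # Single hash lookup by primary key.
        return self._rows.get(key, default)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed.
        if key in self._rows:
            del self._rows[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42", {"name": "A.N. Other", "city": "London"})
print(store.get("user:42"))
```

What the sketch leaves out is the point of the slide: there is no way to ask "which users live in London?" without scanning every value, which is the trade a key-value design makes for simplicity.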
NoSQL drivers

[Bar chart, % (axis 0 to 60): "What is the biggest data management problem driving your use of NoSQL in the coming year?" Categories in chart order: Lack of flexibility, Inability to scale out data, High latency, Costs, All of these, Other]

Source: Couchbase NoSQL Survey (December 2011)


Eye on NoSQL 2013

[Bar chart, % (axis 0 to 60), categories in chart order: NoSQL not priority; Variable data, models; Easier management; Fast, flexible dev; High-scale web, mobile apps; Lower h/w, storage cost; Lower s/w, deployment cost]

Source: “2014 Analytics, BI, and Information Management Survey” InformationWeek (November 2013)
Eye on NoSQL 2014

[Bar chart, % (axis 0 to 60), categories in chart order: NoSQL not priority; Variable data, models; Easier management; Fast, flexible dev; High-scale web, mobile apps; Lower s/w, deployment cost; Lower h/w, storage cost]

Source: “2015 Analytics & BI Survey” InformationWeek (December 2014)


Schema-free

Source: Shutterstock Image ID 128628794


But ...
We started using mongo early 2009, and
even just one year out it feels so much
more painful to maintain than our Postgres
or MySQL systems that have been around
since 1999! My theory is that NoSQL
sacrifices maintenance and future
development effort for the sake of startup
development.
-- Luke Crouch
Source: http://bluke.blogspot.co.uk/2010/05/quick-blurb-on-nosql.html
And ...
Inquiries from Gartner clients indicate that
schema design for NoSQL DBMSs is one
of the biggest barriers to adopting this new
technology. Simply selecting a NoSQL
DBMS and hoping the underlying
technology will accommodate poor design
choices will lead to a poorly performing
application and database, and to rework.
-- Adam M. Ronthal and Nick Heudecker
Source: “Five Data Persistence Dilemmas That Will Keep CIOs Up at Night” Gartner (24 June 2015)
Schema

Source: Luke Crouch, used with permission


Big data
Variety Velocity Volume
Big data infrastructure

Source: “Analytics: The real-world use of big data” IBM and University of Oxford (October 2012)
Scenario where NoSQL is useful

A.N. Other ownsCar 2005 VW Polo

A.N. Other ownsHouse 123 High St, London

A.N. Other ownsComp 2014 MacBook Air
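
One way to read the scenario above: each fact has a different shape, so a fixed set of columns fits poorly, while a schema-free document absorbs new attributes as they appear. A minimal sketch, assuming the three facts are held as (subject, predicate, object) triples; the helper name `to_documents` is hypothetical.

```python
# The three facts from the slide, held as (subject, predicate, object) triples.
facts = [
    ("A.N. Other", "ownsCar", "2005 VW Polo"),
    ("A.N. Other", "ownsHouse", "123 High St, London"),
    ("A.N. Other", "ownsComp", "2014 MacBook Air"),
]


def to_documents(triples):
    """Group triples into one schema-free document (a dict) per subject."""
    documents = {}
    for subject, predicate, obj in triples:
        # A predicate not seen before simply becomes a new field; no schema change.
        documents.setdefault(subject, {})[predicate] = obj
    return documents


documents = to_documents(facts)
print(documents["A.N. Other"])
```

Adding a fourth kind of fact would need no change to the function or to any stored document.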


Brewer’s CAP “Theorem” ...

[Venn diagram of Consistency, Availability (A), and Partition tolerance (P), with overlaps CA, CP, and AP; annotated with ACID, Enforced Consistency, and BASE]

Source: After http://guide.couchdb.org/editions/1/en/consistency.html


ACID vs. BASE ...

ACID                 BASE
• Atomicity          • Basically Available
• Consistency        • Soft state
• Isolation          • Eventual consistency
• Durability
Source: Shutterstock Image ID 196307495 and Shutterstock Image ID 196305647
ACID vs. BASE
ACID                             BASE
• Strong consistency             • Weak consistency
• Isolation                      • Availability first
• Focus on “commit”              • Best effort
• Nested transactions            • Approximate answers OK
• Conservative (pessimistic)     • Aggressive (optimistic)
• Availability                   • Simpler, faster
• Difficult evolution            • Easier evolution

Source: After “Towards Robust Distributed Systems” Eric Brewer (2000)
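
The ACID column's focus on "commit" can be illustrated with Python's standard `sqlite3` module: either both halves of a transfer are committed, or neither is. This is a sketch under assumptions; the `account` table, the names, and the amounts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return False if rolled back."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            # Violates the CHECK constraint when src has insufficient funds.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


ok = transfer(conn, "alice", "bob", 150)  # the credit succeeds, the debit fails
print(ok, balances(conn))
```

The credit to `bob` is applied first and then undone by the rollback, which is the atomicity guarantee a BASE-style store typically leaves to the application.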


But ...
... we find developers spend a significant
fraction of their time building extremely
complex and error-prone mechanisms to
cope with eventual consistency and
handle data that may be out of date. We
think this is an unacceptable burden to
place on developers and that consistency
problems should be solved at the
database level.
Source: http://research.google.com/pubs/archive/41344.pdf
Use the right tool

Source: http://www.sandraandwoo.com/2013/02/07/0453-cassandra/
Tuneable CAP
• Examples
– Cassandra
– MongoDB
– Riak
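
One common way consistency is made tuneable in Dynamo-style stores is the quorum overlap rule: with N replicas, a read of R replicas is guaranteed to see the latest acknowledged write of W replicas when R + W > N. The rule is general background, not something stated on the slide, and the function below is a hypothetical helper; each product exposes its own settings and semantics.

```python
def quorums_overlap(n: int, w: int, r: int) -> bool:
    """True if every read set of size r intersects every write set of size w.

    n: number of replicas; w: replicas that must acknowledge a write;
    r: replicas consulted on a read. All three are illustrative parameters.
    """
    if not (1 <= w <= n and 1 <= r <= n):
        raise ValueError("w and r must be between 1 and n")
    return r + w > n


# With 3 replicas: quorum writes and reads overlap; single-replica ones do not.
print(quorums_overlap(3, 2, 2), quorums_overlap(3, 1, 1))
```

Lower values of `w` and `r` favour latency and availability; higher values favour consistency, which is the dial the slide's title refers to.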
MongoDB speed vs. safety
Options            WriteConcern           Notes
w=0, j=0           UNACKNOWLEDGED         Fire and Forget
w=1, j=0           ACKNOWLEDGED           Operation completed successfully in memory
w=1, j=1           JOURNALED              Operation written to the journal file
w=1, fsync=true    FSYNCED                Operation written to disk
w=2, j=0           REPLICA_ACKNOWLEDGED   Ack by primary and at least one secondary
w=majority, j=0    MAJORITY               Ack by the majority of nodes

Source: “MongoDB Replication” Philipp Krenn (2014)
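
The table above can be restated as a lookup from option values to the named write concern. This is a plain data sketch that transcribes the slide's rows; it does not call any MongoDB driver, and the function `name_for` is a hypothetical helper.

```python
# Rows transcribed from the slide's table: name -> options.
WRITE_CONCERNS = {
    "UNACKNOWLEDGED": {"w": 0, "j": False},
    "ACKNOWLEDGED": {"w": 1, "j": False},
    "JOURNALED": {"w": 1, "j": True},
    "FSYNCED": {"w": 1, "fsync": True},
    "REPLICA_ACKNOWLEDGED": {"w": 2, "j": False},
    "MAJORITY": {"w": "majority", "j": False},
}


def name_for(**options):
    """Return the table's name for an exact combination of options."""
    for name, expected in WRITE_CONCERNS.items():
        if expected == options:
            return name
    raise KeyError(f"no row in the table matches {options}")


print(name_for(w=1, j=True))
```

Reading down the table, each row trades some write latency for a stronger guarantee that the write survives a failure.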


MongoDB Replica Sets

Source: Adapted from “Don’t fight MongoDB” Mirko Bonadei (2013)


NoSQL
SQL

ACID
BASE
ACID
Shades of grey

Source: http://blog.mongodb.org/post/523516007/on-distributed-consistency-part-6-consistency-chart
Choices, choices

Source: Infochimps, used with permission


451 Research: Data Platforms Landscape Map – September 2014

[Figure: landscape map on a grid (columns 1 to 6, rows A to E) placing data platform vendors and products in a Relational zone, a Non-relational zone, and a Grid/cache zone, with edges labelled Towards enterprise search, Towards E-discovery, and Towards SIEM. Key: General purpose, Specialist analytic, -as-a-Service, BigTables, Graph, Document, Key value stores, Key value direct access, Hadoop, MySQL ecosystem, Advanced clustering/sharding, New SQL databases, Data caching, Data grid, Search, Appliances, In-memory, Stream processing. © 2014 by 451 Research LLC. All rights reserved.]

Source: 451 Research, used with permission
451 Research: Data Platforms Landscape Map – ~2009
[Figure: landscape map of data platform products arranged in a relational zone, a non-relational zone and a grid/cache zone, with regions towards enterprise search, e-discovery and SIEM. Key: general purpose, specialist analytic, data grid/cache, search, in-memory, stream processing.]
© 2014 by 451 Research LLC. All rights reserved
Source: 451 Research, used with permission
How many systems? ...
There are a lot of Key/Value stores and
distributed schema-free Document
Oriented Databases out there. They’re
springing up like weeds in a spring garden.
And folks love to blog about them and/or
talk about how their favorite is better than
the others (or MySQL).
-- Jeremy Zawodny

Source: http://blog.zawodny.com/2010/03/28/nosql-is-software-darwinism/
How many systems?

[Pie chart: NoSQL systems by category. Legend: KV / Tuple Store, Document Store, Object Databases, Graph Databases, Column Store, Grid and Cloud, Multimodel, XML Databases, Other. Slice values as shown: 27%, 17%, 14%, 13%, 11%, 7%, 4%, 4%, 3%.]

Source: http://nosql-database.org/ (24 March 2015)


Major categories of NoSQL ...

Type Examples

Document store

Column store

Key-Value store

Graph store
Source: 451 Research, used with permission
Major categories of NoSQL
Document store: Key -> Document (collection of K-V)
Column store: Key -> CF1:C1, CF1:C2, CF2:C1, CF3:C1
Key-Value store: Key -> Binary Data
Graph store: Node 1 (Key, Properties), Node 2 (Key, Properties), Relationship 1 (Key, Properties)


Source: Ilya Katsov, used with permission
Popular NoSQL DBs
License Protocol API/Query Replication

Apache Thrift CQL, Thrift P2P

Apache REST/HTTP JSON, MR M-M

AGPL Proprietary BSON M-S, Shard

BSD Telnet-Like* Many Langs. M-S

Apache REST/HTTP JSON, MR P2P*

Source: “Big Data Projects: How to Choose NoSQL Databases” Thomas Casselberry (21 January 2015)
The rise of multi-model DBs
K-V Column Document Graph

✔ ✔ ✔

✔ ✔ ✔*

✔ ✔

✔ ✔
Commercialization examples
Document store
• Represent rich, hierarchical data structures,
reducing the need for multi-table joins
• Structure of the documents need not be known a
priori, can be variable, and evolve instantly, but
a query can understand the contents of a
document
• Use cases: rapid ingest and delivery for evolving
schemas and web-based objects
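The points above can be sketched without any database at all. The following is a minimal in-memory illustration, not a real document store API: the class `DocumentStoreSketch` and its `doc` and `find` helpers are hypothetical names introduced only for this sketch. It shows two documents with different structures living in the same collection, and a query that still inspects document contents.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative in-memory model of a document collection; not a real database API.
class DocumentStoreSketch {

    // A "collection" is just a list of schema-free documents.
    static final List<Map<String, Object>> people = new ArrayList<>();

    // Build a document from alternating key/value arguments.
    static Map<String, Object> doc(Object... kv) {
        Map<String, Object> d = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            d.put((String) kv[i], kv[i + 1]);
        }
        return d;
    }

    // Return every document whose field equals the given value.
    // Documents that lack the field are simply skipped.
    static List<Map<String, Object>> find(String field, Object value) {
        List<Map<String, Object>> hits = new ArrayList<>();
        for (Map<String, Object> d : people) {
            if (value.equals(d.get(field))) {
                hits.add(d);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two documents with different shapes in the same collection.
        people.add(doc("name", "akmal", "age", 40,
                "likes", Arrays.asList("satay", "kebabs", "fish-n-chips")));
        people.add(doc("name", "alice",
                "address", doc("city", "London")));
        System.out.println(find("name", "akmal"));
    }
}
```

The nested `address` document and the `likes` array stand in for the hierarchical structures that would otherwise need multi-table joins.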
MongoDB example
{
  "namespace 1": any json object,
  "namespace 2": any json object,
  ...
}

{
  "namespace 1": [
    {
      "_id": "key 1",
      "property 1": "value",
      "property 2": {
        "property 3": "value",
        "property 4": [ "value", "value", "value" ]
      }, ...
    },
    ...
  ]
}

Source: Frank Denis, used with permission


Connection
private static final String DBNAME = "demodb";
private static final String COLLNAME = "people";
...
MongoClient mongoClient = new MongoClient("localhost", 27017);
DB db = mongoClient.getDB(DBNAME);
DBCollection collection = db.getCollection(COLLNAME);

System.out.println("Connected to MongoDB");
Create
BasicDBObject document = new BasicDBObject();

List<String> likes = new ArrayList<String>();


likes.add("satay");
likes.add("kebabs");
likes.add("fish-n-chips");

document.put("name", "akmal");
document.put("age", 40);
document.put("date", new Date());
document.put("likes", likes);

collection.insert(document);
Read
BasicDBObject document = new BasicDBObject();
document.put("name", "akmal");

DBCursor cursor = collection.find(document);

while (cursor.hasNext()) {
    System.out.println(cursor.next());
}

cursor.close();
Update
BasicDBObject document = new BasicDBObject();
document.put("name", "akmal");

BasicDBObject newDocument = new BasicDBObject();


newDocument.put("age", 29);

BasicDBObject updateObj = new BasicDBObject();


updateObj.put("$set", newDocument);

collection.update(document, updateObj);
Delete
BasicDBObject document = new BasicDBObject();
document.put("name", "akmal");

collection.remove(document);
Connection
var async = require('async');
var MongoClient = require('mongodb').MongoClient;
MongoClient.connect("mongodb://localhost:27017/demodb",
  function(err, db) {
    if (err) {
      return console.log(err);
    }
    console.log("Connected to MongoDB");
    var collection = db.collection('people');
    var document = {
      'name':'akmal',
      'age':40,
      'date':new Date(),
      'likes':['satay', 'kebabs', 'fish-n-chips']
    };
Create
function (callback) {
  collection.insert(document, {w:1}, function(err, result) {
    if (err) {
      return callback(err);
    }
    callback();
  });
},
Read
function (callback) {
  collection.findOne({'name':'akmal'}, function(err, item) {
    if (err) {
      return callback(err);
    }
    console.log(item);
    callback();
  });
},
Update
function (callback) {
  collection.update({'name':'akmal'}, {$set:{'age':29}}, {w:1},
    function(err, result) {
      if (err) {
        return callback(err);
      }
      callback();
    });
},
Delete
function (callback) {
  collection.remove({'name':'akmal'}, function(err, result) {
    if (err) {
      return callback(err);
    }
    callback();
  });
},
Column store ...
• Manage structured data, with multiple-attribute
access
• Columns are grouped together in “column-
families/groups”; each storage block contains
data from only one column/column set to provide
data locality for “hot” columns
• Column groups defined a priori, but support
variable schemas within a column group
Column store
• Scale using replication, multi-node distribution
for high availability and easy failover
• Optimized for writes
• Use cases: high throughput verticals (activity
feeds, message queues), caching, web
operations
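The column-family model described above can be pictured as nested sorted maps. This is a minimal sketch under that assumption, not how any particular product stores data: the class `ColumnFamilySketch` and the family names `profile` and `activity` are made up for illustration. Families are fixed when the table is created, while the columns inside a family vary from row to row.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Illustrative model only: row key -> column family -> column -> value.
class ColumnFamilySketch {

    // Column families are fixed when the table is defined (a priori).
    private final Set<String> families;
    private final Map<String, Map<String, Map<String, String>>> rows = new TreeMap<>();

    ColumnFamilySketch(String... families) {
        this.families = new HashSet<>(Arrays.asList(families));
    }

    // Columns inside a family can vary freely from row to row.
    void put(String rowKey, String family, String column, String value) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException("Unknown column family: " + family);
        }
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .put(column, value);
    }

    // Read one family of one row; other families are not touched.
    Map<String, String> get(String rowKey, String family) {
        Map<String, Map<String, String>> row = rows.get(rowKey);
        if (row == null || !row.containsKey(family)) {
            return new TreeMap<>();
        }
        return row.get(family);
    }

    public static void main(String[] args) {
        ColumnFamilySketch table = new ColumnFamilySketch("profile", "activity");
        table.put("key 1", "profile", "name", "akmal");
        table.put("key 1", "profile", "age", "40");
        table.put("key 2", "profile", "name", "alice"); // no "age" column for this row
        table.put("key 2", "activity", "last_login", "2015-08-02");
        System.out.println(table.get("key 1", "profile"));
        System.out.println(table.get("key 2", "profile"));
    }
}
```

Reading `profile` for a row never touches `activity`, which is the data-locality argument for grouping "hot" columns together.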
Cassandra example
{
  "column family 1": {
    "key 1": {
      "property 1": "value",
      "property 2": "value"
    },
    "key 2": {
      "property 1": "value",
      "property 4": "value",
      "property 5": "value"
    }
  }, ...
}

{
  "column family 2": {
    "super key 1": {
      "key 1": {
        "property 1": "value",
        "property 2": "value"
      },
      "key 2": {
        "property 1": "value",
        "property 4": "value",
        "property 5": "value"
      }, ...
    }, ...
  }, ...
}

Source: Frank Denis, used with permission


Connection
Class.forName("org.apache.cassandra.cql.jdbc.CassandraDriver");
connection = DriverManager.getConnection(
"jdbc:cassandra://localhost:9160/demodb");

System.out.println("Connected to Cassandra");
Create
String query =
"BEGIN BATCH\n" +
"INSERT INTO people (name, age, date, likes) VALUES ('akmal', 40, '"
+ new Date() +
"', {'satay', 'kebabs', 'fish-n-chips'})\n" +
"APPLY BATCH;";

Statement statement = connection.createStatement();


statement.executeUpdate(query);
statement.close();
Read
String query = "SELECT * FROM people";

Statement statement = connection.createStatement();


ResultSet cursor = statement.executeQuery(query);

while (cursor.next()) {
    for (int j = 1; j < cursor.getMetaData().getColumnCount()+1; j++) {
        System.out.printf("%-10s: %s%n",
            cursor.getMetaData().getColumnName(j),
            cursor.getString(cursor.getMetaData().getColumnName(j)));
    }
}

cursor.close();
statement.close();
Update
String query =
"UPDATE people SET age = 29 WHERE name = 'akmal'";

Statement statement = connection.createStatement();


statement.executeUpdate(query);
statement.close();
Delete
String query =
"BEGIN BATCH\n" +
"DELETE FROM people WHERE name = 'akmal'\n" +
"APPLY BATCH;";

Statement statement = connection.createStatement();


statement.executeUpdate(query);
statement.close();
Key-Value store
• Simplest NoSQL stores, provide low-latency
writes but single key/value access
• Store data as a hash table of keys where every
key maps to an opaque binary object
• Easily scale across many machines
• Use-cases: applications that require massive
amounts of simple data (sensor, web
operations), applications that require rapidly
changing data (stock quotes), caching
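The "hash table of keys to opaque binary objects" idea, and why it scales out easily, can be sketched as follows. This is an assumption-laden toy: the class `KeyValueSketch` is hypothetical, each "node" is just a local `HashMap`, and simple modulo placement is used for brevity (real systems typically use more elaborate partitioning).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: each "node" is a local hash table of key -> opaque bytes.
class KeyValueSketch {

    private final List<Map<String, byte[]>> nodes = new ArrayList<>();

    KeyValueSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    // The key alone decides which node holds the value.
    int nodeFor(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    // The store never looks inside the value; it is an opaque blob.
    void put(String key, byte[] value) {
        nodes.get(nodeFor(key)).put(key, value);
    }

    // Single-key access only: no scans, joins, or queries on value contents.
    byte[] get(String key) {
        return nodes.get(nodeFor(key)).get(key);
    }

    public static void main(String[] args) {
        KeyValueSketch store = new KeyValueSketch(3);
        store.put("uid:1:name", "akmal".getBytes(StandardCharsets.UTF_8));
        byte[] raw = store.get("uid:1:name");
        System.out.println(new String(raw, StandardCharsets.UTF_8)
                + " is stored on node " + store.nodeFor("uid:1:name"));
    }
}
```

Because every operation names exactly one key, a request can be routed to a single node without coordination, at the price of having no way to query by value.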
Redis and Riak examples
Redis:
{
  database number: {
    "key 1": "value",
    "key 2": [ "value", "value", "value" ],
    "key 3": [
      { "value": "value", "score": score },
      { "value": "value", "score": score },
      ...
    ],
    "key 4": {
      "property 1": "value",
      "property 2": "value",
      "property 3": "value", ...
    }, ...
  }
}

Riak:
{
  "bucket 1": {
    "key 1": document + content-type,
    "key 2": document + content-type,
    "link to another object 1": URI of other bucket/key,
    "link to another object 2": URI of other bucket/key,
  },
  "bucket 2": {
    "key 3": document + content-type,
    "key 4": document + content-type,
    "key 5": document + content-type
    ...
  }, ...
}

Source: Frank Denis, used with permission


Connection
Jedis j = new Jedis("localhost", 6379);
j.connect();

System.out.println("Connected to Redis");
Create
String id = Long.toString(j.incr("global:nextUserId"));

j.set("uid:" + id + ":name", "akmal");


j.set("uid:" + id + ":age", "40");
j.set("uid:" + id + ":date", new Date().toString());
j.sadd("uid:" + id + ":likes", "satay");
j.sadd("uid:" + id + ":likes", "kebabs");
j.sadd("uid:" + id + ":likes", "fish-n-chips");

j.hset("uid:lookup:name", "akmal", id);


Read
String id = j.hget("uid:lookup:name", "akmal");

print("name ", j.get("uid:" + id + ":name"));


print("age ", j.get("uid:" + id + ":age"));
print("date ", j.get("uid:" + id + ":date"));
print("likes ", j.smembers("uid:" + id + ":likes"));
Update
String id = j.hget("uid:lookup:name", "akmal");

j.set("uid:" + id + ":age", "29");


Delete
String id = j.hget("uid:lookup:name", "akmal");

j.del("uid:" + id + ":name");
j.del("uid:" + id + ":age");
j.del("uid:" + id + ":date");
j.del("uid:" + id + ":likes");
Graph store
• Use nodes, relationships between nodes, and
key-value properties
• Access data using graph traversal, navigating
from start nodes to related nodes according to
graph algorithms
• Faster for associative data sets
• Use cases: storing and reasoning on complex
and connected data, such as inferencing
applications in healthcare, government, telecom,
oil, performing closure on social networking
graphs
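Graph traversal from a start node, as described above, can be sketched with a breadth-first search over typed relationships. This is a minimal in-memory illustration; the class `GraphSketch` and its `relate` and `reachable` methods are hypothetical and unrelated to any product API. It computes the closure of one relationship type, the kind of operation the social networking use case refers to.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative in-memory graph: nodes are names, relationships are typed and directed.
class GraphSketch {

    static final class Relationship {
        final String type;
        final String target;

        Relationship(String type, String target) {
            this.type = type;
            this.target = target;
        }
    }

    private final Map<String, List<Relationship>> outgoing = new HashMap<>();

    void relate(String from, String type, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>())
                .add(new Relationship(type, to));
    }

    // Breadth-first traversal: every node reachable from start by following
    // only relationships of the given type (the start node itself is excluded).
    Set<String> reachable(String start, String type) {
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        while (!frontier.isEmpty()) {
            String node = frontier.poll();
            List<Relationship> rels =
                    outgoing.getOrDefault(node, Collections.<Relationship>emptyList());
            for (Relationship r : rels) {
                if (r.type.equals(type) && !r.target.equals(start)
                        && visited.add(r.target)) {
                    frontier.add(r.target);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        GraphSketch g = new GraphSketch();
        g.relate("akmal", "KNOWS", "bob");
        g.relate("bob", "KNOWS", "carol");
        g.relate("carol", "KNOWS", "akmal"); // a cycle is handled by the visited set
        g.relate("akmal", "LIKES", "satay");
        System.out.println(g.reachable("akmal", "KNOWS"));
    }
}
```

In a relational schema the same question needs a recursive self-join; here each step only follows the relationships attached to the current node.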
Connection
private static final String DB_PATH =
"C:/neo4j-community-1.8.2/data/graph.db";

private static enum RelTypes implements RelationshipType {
    LIKES
}
...
graphDb =
new GraphDatabaseFactory().newEmbeddedDatabase(DB_PATH);
registerShutdownHook(graphDb);

System.out.println("Connected to Neo4j");
Create
Transaction tx = graphDb.beginTx();

try {
    firstNode = graphDb.createNode();
    firstNode.setProperty("name", "akmal");
    firstNode.setProperty("age", 40);
    firstNode.setProperty("date", new Date().toString());
    secondNode = graphDb.createNode();
    secondNode.setProperty("food", "satay, kebabs, fish-n-chips");
    relationship = firstNode.createRelationshipTo(secondNode,
        RelTypes.LIKES);
    relationship.setProperty("likes", "likes");
    tx.success();
} finally { tx.finish(); }
Read
Transaction tx = graphDb.beginTx();

try {
    print("name", firstNode.getProperty("name"));
    print("age", firstNode.getProperty("age"));
    print("date", firstNode.getProperty("date"));
    print("likes", secondNode.getProperty("food"));
    tx.success();
} finally { tx.finish(); }
Update
Transaction tx = graphDb.beginTx();

try {
    firstNode.setProperty("age", 29);
    tx.success();
} finally { tx.finish(); }
Delete
Transaction tx = graphDb.beginTx();

try {
    firstNode.getSingleRelationship(RelTypes.LIKES,
        Direction.OUTGOING).delete();
    firstNode.delete();
    secondNode.delete();
    tx.success();
} finally { tx.finish(); }
NoSQL use cases ...
• Online/mobile gaming
– Leaderboard (high score table) management
– Dynamic placement of visual elements
– Game object management
– Persisting game/user state information
– Persisting user generated data (e.g. drawings)
• Display advertising on web sites
– Ad Serving: match content with profile and present
– Real-time bidding: match cookie profile with advert
inventory, obtain bids, and present advert
NoSQL use cases
• Dynamic content management and publishing
(news and media)
– Store content from distributed authors, with fast
retrieval and placement
– Manage changing layouts and user generated content
• E-commerce/social commerce
– Storing frequently changing product catalogs
• Social networking/online communities
• Communications
– Device provisioning
Use case requirements ...
• Schema flexibility and development agility
– Application not constrained by fixed pre-defined
schema
– Application drives the schema
– Ability to develop a minimal application rapidly, and
iterate quickly in response to customer feedback
– Ability to quickly add, change or delete “fields” or
data-elements
– Ability to handle mix of structured, unstructured data
– Easier, faster programming, so faster time to market
and quick to adapt
Use case requirements ...
• Consistent low latency, even under high load
– Typically milliseconds or sub-milliseconds, for reads
and writes
– Even with millions of users
• Dynamic elasticity
– Rapid horizontal scalability
– Ability to add or delete nodes dynamically
– Application transparent elasticity, such as automatic
(re)distribution of data, if needed
– Cloud compatibility
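One common way to get "automatic (re)distribution of data" when nodes are added or removed is consistent hashing. The sketch below is an illustration of that general technique, not a description of any specific product: the class `HashRingSketch`, the virtual node count of 64, and the use of MD5 as the ring hash are all assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Illustrative consistent-hash ring with virtual nodes.
class HashRingSketch {

    private static final int VIRTUAL_NODES = 64; // assumed value for the sketch
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // Map a string to a position on the ring using the first 8 bytes of MD5.
    static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xff);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Each physical node owns many points on the ring to even out the load.
    void addNode(String node) {
        for (int i = 0; i < VIRTUAL_NODES; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    // A key belongs to the first ring point at or after its own position,
    // wrapping around to the start of the ring if necessary.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("No nodes in the ring");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        HashRingSketch ring = new HashRingSketch();
        ring.addNode("A");
        ring.addNode("B");
        ring.addNode("C");
        String before = ring.nodeFor("uid:1:name");
        ring.addNode("D");
        String after = ring.nodeFor("uid:1:name");
        System.out.println("before=" + before + " after=" + after);
    }
}
```

The useful property is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes, so the application can stay unaware of the change.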
Use case requirements
• High availability
– 24 x 7 x 365 availability
– (Today) Requires data distribution and replication
– Ability to upgrade hardware or software without any
down time
• Low cost
– Commonly available hardware
– Lower cost software, such as open source or pay-per-
use in cloud
– Reduced need for database admin and maintenance
Security and
vulnerability
Security

SQL

Source: Shutterstock Image ID 134699780


NoSQL databases threat model
1. Transactional integrity
2. Lax authentication mechanisms
3. Inefficient authorization mechanisms
4. Susceptibility to injection attacks
5. Lack of consistency
6. Insider attacks

Source: “Expanded Top Ten Big Data Security and Privacy Challenges” CSA (April 2013)
MongoDB security
The most effective way to reduce risk for
MongoDB deployments is to run your
entire MongoDB deployment, including all
MongoDB components (i.e. mongod,
mongos and application instances) in a
trusted environment.

Source: http://docs.mongodb.org/v2.4/MongoDB-security-guide.pdf (October 2014)


CouchDB security
When you start out fresh, CouchDB allows
any request to be made by anyone ...
While it is incredibly easy to get started
with CouchDB that way, it should be
obvious that putting a default installation
into the wild is adventurous. Any rogue
client could come along and delete a
database.

Source: http://guide.couchdb.org/draft/security.html (2 August 2015)


Redis security
Redis is designed to be accessed by
trusted clients inside trusted environments.
This means that usually it is not a good
idea to expose the Redis instance directly
to the internet or, in general, to an
environment where untrusted clients can
directly access the Redis TCP port or
UNIX socket.

Source: http://redis.io/topics/security/ (2 August 2015)


Well-known ports
Product Ports
MongoDB 27017 28017 27080
CouchDB 5984
HBase 9000
Cassandra 9160
Neo4j 7474
Redis 6379
Riak 8098

Source: “Abusing NoSQL Databases” Ming Chow (2013)


Shodan port example
NoSQL injection attacks ...
• NoSQL systems are
vulnerable
• Various types of
attacks
• Understand the
vulnerabilities and
consequences
NoSQL injection attacks
• Popular NoSQL
products will attract
more interest and
scrutiny
• Features of some
programming
languages, e.g. PHP
• Server-Side
JavaScript (SSJS)
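A server-side JavaScript injection can be illustrated with plain string handling. The sketch below assumes an application that builds a JavaScript predicate by concatenation, for example for an operator such as MongoDB's `$where`, which accepts a JavaScript expression. The class `InjectionSketch`, its method names, and the allow-list pattern are hypothetical and only demonstrate the principle.

```java
// Illustrative only: shows how concatenated input changes the meaning of a
// server-side JavaScript predicate, and one simple defensive check.
class InjectionSketch {

    // UNSAFE: user input is pasted directly into the predicate text.
    static String unsafePredicate(String name) {
        return "this.name == '" + name + "'";
    }

    // Defensive option: accept only input that matches a strict allow-list
    // (the pattern is an assumption for this sketch).
    static boolean isSafeName(String name) {
        return name != null && name.matches("[A-Za-z0-9 _-]{1,64}");
    }

    public static void main(String[] args) {
        String honest = "akmal";
        String hostile = "' || '1' == '1";

        // The honest input yields the intended comparison.
        System.out.println(unsafePredicate(honest));

        // The hostile input closes the quote and adds a clause that is always true.
        System.out.println(unsafePredicate(hostile));

        System.out.println(isSafeName(honest));
        System.out.println(isSafeName(hostile));
    }
}
```

A sturdier approach than validation alone is to avoid evaluating strings as code at all and pass user input only as data values in a structured query.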
NoSQL injection testing
• NoSQLMap project
– Open source proof-of-concept Python tool
– Automates injection attacks
– Exploits MongoDB vulnerabilities
– Future support for other NoSQL databases
Polyglot
persistence

Source: Heroku, used with permission


Polyglot persistence
User Sessions Financial Data Shopping Cart Recommendations

Product Catalog Reporting Analytics User Activity Logs

Source: Adapted from http://martinfowler.com/bliki/PolyglotPersistence.html


But ...
What have you built?
• Did you just pick things at random?
• Why is Redis talking to MongoDB?
• Why do you even use MongoDB?

Source: After https://twitter.com/codinghorror/status/347070841059692545/


Polyglot persistence examples
• Disney
– Cassandra, Hadoop, MongoDB
• Interactive Mediums
– CouchDB, MySQL
• Mendeley
– HBase, MongoDB, Solr, Voldemort
• Netflix
– Cassandra, Hadoop/HBase, RDBMS, SimpleDB
• Twitter
– Cassandra, FlockDB, Hadoop/HBase, MySQL
Polyglot persistence
• NoSQL product specialization requires
developer knowledge and skills for each platform
• Different APIs
– Develop public API for each NoSQL store (Disney)
Public API for NoSQL store
In some cases, the team decided to hide
the platform’s complexity from users; not
to facilitate its use, but to keep loose-
cannon developers from doing something
crazy that could take down the whole
cluster. It could show them all the controls
and knobs in a NoSQL database, but “they
tend to shoot each other,” Jacob said.
“First they shoot themselves, then they
shoot each other.”
Source: “How Disney built a big data platform on a startup budget” Derrick Harris (2012)
[Diagram: four modules that communicate by asynchronous message passing (actors). Module 1: graph-structured domain rules. Module 2: columnar data access with decentralization. Module 3: document structures. Module 4: document structures with offline processing.]

Source: Debasish Ghosh, used with permission


Multi-paradigm example
• Application that routes picking baskets for
inventory in a warehouse
• A graph with bins of inventory (nodes) along
aisles (edges)
• Store graph in Neo4j for performance
• Asynchronously persist in MySQL for reporting
• Move data using asynchronous message queue
• Faster performance, easier development,
simpler scaling, and reduced cost

Source: http://akfpartners.com/techblog/2011/06/21/multi-paradigm-data-storage-architectures/
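The pattern in this example (write to the fast store synchronously, then persist to the reporting store through an asynchronous queue) can be sketched with standard Java concurrency classes. Everything here is a stand-in: `AsyncPersistSketch` is a hypothetical class, and the two in-memory lists merely play the roles of the graph store and the relational reporting store.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: two lists stand in for the graph store and the reporting store.
class AsyncPersistSketch {

    private static final String STOP = "__stop__"; // sentinel that ends the worker

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    final List<String> graphStore = Collections.synchronizedList(new ArrayList<String>());
    final List<String> reportingStore = Collections.synchronizedList(new ArrayList<String>());
    private final Thread writer = new Thread(this::drain);

    AsyncPersistSketch() {
        writer.start();
    }

    // Fast path: update the operational store, then hand the event off.
    void recordPick(String event) {
        graphStore.add(event);
        queue.add(event);
    }

    // Background worker: copies events into the reporting store in order.
    private void drain() {
        try {
            for (String e = queue.take(); !e.equals(STOP); e = queue.take()) {
                reportingStore.add(e);
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    // Flush remaining events and wait for the worker to finish.
    void shutdown() {
        queue.add(STOP);
        try {
            writer.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        AsyncPersistSketch app = new AsyncPersistSketch();
        app.recordPick("basket 1: bin A3");
        app.recordPick("basket 1: bin B7");
        app.shutdown();
        System.out.println(app.reportingStore);
    }
}
```

The caller of `recordPick` never waits for the reporting store, which is why the reporting copy may briefly lag behind the operational one.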
Polyglot persistence with
EclipseLink JPA
• Java Persistence API (JPA) for access to
NoSQL systems
• Annotations and XML to identify stored NoSQL
entities
• An application can use multiple database
systems
• Single composite Persistence Unit (PU) supports
relational and non-relational data
• Support for MongoDB and Oracle NoSQL with
other products planned
Benchmarks and
performance
Yahoo Cloud Serving BM ...
• Originally Tested Systems
– Cassandra, HBase, Yahoo!’s PNUTS, sharded
MySQL
• Tier 1 (performance)
– Latency by increasing the server load
• Tier 2 (scalability)
– Scalability by increasing the number of servers
Yahoo Cloud Serving BM
• Yahoo Cloud Serving
Benchmark (YCSB)
– Research paper
– Slide deck
• Various reports
– See resources
2015 YCSB results ...
2015 YCSB results
Redis customer benchmark

Source: https://redislabs.com/blog/nosql-performance-aerospike-cassandra-datastax-couchbase-redis
How many servers to get 1M
writes/sec on GCE?

Source: http://www.slideshare.net/imcsummit/imcs2015-2-bus4-myth-about-inmemory-databases-busted/
Multi-model benchmark

Source: https://www.arangodb.com/2015/06/how-an-open-source-competitive-benchmark-helped-to-improve-databases/
But ...
... any person who designs a benchmark is
in a ‘no win’ situation, i.e. he can only be
criticized. External observers will find fault
with the benchmark as artificial or
incomplete in one way or another.
Vendors who do poorly on the benchmark
will criticize it unmercifully.
-- Mike Stonebraker

Source: “Readings in Database Systems” 1st Edition (1988)


“Can the Elephants Handle the
NoSQL Onslaught?”
• DSS Workload (TPC-H)
– Hive vs. Parallel Data Warehouse
• Modern OLTP Workload (YCSB)
– MongoDB vs. SQL Server
• Conclusions
– NoSQL systems are behind relational systems in
performance
Linked Data Benchmark Council

• EU-funded project
• Develop Graph and RDF benchmarks
Stress testing
• Jepsen project
– Rigorously test how various database systems handle
partitions
– Evaluate consistency
• Conclusions
– Don’t rely on vendor marketing, product
documentation or “pull the plug” test
BI/Analytics
Architectures
• NoSQL reports
• NoSQL thru and thru
• NoSQL + MySQL
• NoSQL as ETL source
• NoSQL programs in BI tools
• NoSQL via BI database (SQL)

Source: Nicholas Goodman


NoSQL via BI database (SQL)

[Diagram labels: LIVE OR CACHED, PENTAHO .PRPT, local_ALL_CONTRACTS, 15 min, ALL_CONTRACTS, VIEWS, DOCS, view: "all", javascript, map, reduce]

Source: http://www.nicholasgoodman.com/bt/blog/2011/06/22/sql-access-to-couchdb-views-easy-reporting/
NoSQL alternatives
451 Research: Data Platforms Landscape Map – September 2014
[Figure: landscape map of data platform products arranged in a relational zone, a non-relational zone and a grid/cache zone, with regions towards enterprise search, e-discovery and SIEM. Key: general purpose, specialist analytic, -as-a-Service, BigTables, graph, document, key value stores, key value direct access, Hadoop, MySQL ecosystem, advanced clustering/sharding, New SQL databases, data caching, data grid, search, appliances, in-memory, stream processing.]
© 2014 by 451 Research LLC. All rights reserved
Source: 451 Research, used with permission
NewSQL
• Today, new challenges and requirements
– “Web changes everything”
• Need more OLTP throughput
• Need real-time analytics
• ACID support
• Preserve SQL
– Automatic query optimization
• Preserve investment
– Existing skills and tools
Connection
Class.forName("com.nuodb.jdbc.Driver");

Properties properties = new Properties();

properties.put("user", "dba");
properties.put("password", "goalie");
properties.put("schema", "test");

connection = DriverManager.getConnection(
"jdbc:com.nuodb://localhost/test", properties);

System.out.println("Connected to NuoDB");
Create
PreparedStatement statement = connection.prepareStatement(
"INSERT INTO people (name, age, date, likes) VALUES (?, ?, ?, ?)");

statement.setString(1, "akmal");
statement.setInt(2, 40);
statement.setString(3, new Date().toString());
statement.setString(4, "satay kebabs fish-n-chips");
statement.addBatch();
statement.executeBatch();
connection.commit();
Read
String query = "SELECT * FROM people;";

Statement statement = connection.createStatement();


ResultSet cursor = statement.executeQuery(query);

while (cursor.next()) {
    System.out.print(cursor.getString(1) + " ");
    System.out.print(cursor.getInt(2) + " ");
    System.out.print(cursor.getString(3) + " ");
    System.out.println(cursor.getString(4));
}

cursor.close();
statement.close();
Update
String query =
"UPDATE people SET age = 29 WHERE name = 'akmal';";

Statement statement = connection.createStatement();


statement.executeUpdate(query);
connection.commit();
readData(connection);
Delete
String query = "DELETE FROM people WHERE name = 'akmal';";

Statement statement = connection.createStatement();


statement.executeUpdate(query);
connection.commit();
Relational
• Vendors adding
NoSQL capabilities
– Documents (JSON)
– Linked data (RDF)
Relational vs. XML vs. RDF
Relational XML RDF

Tables Trees Graphs

Flat, highly structured Hierarchical data Linked data

Rows in a table Nodes in a tree Triples describe links

Fixed schema No or flexible schema Highly flexible

SQL (ANSI/ISO) XPath/XQuery (W3C) SPARQL (W3C)
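The "triples describe links" row can be made concrete with a tiny in-memory triple list and pattern matching, which is roughly the idea behind a basic SPARQL triple pattern. This is only a sketch: `TripleSketch` is a hypothetical class, plain strings stand in for URIs and literals, and `null` plays the role of a query variable.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: linked data as (subject, predicate, object) triples.
class TripleSketch {

    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }

        @Override
        public String toString() {
            return "(" + subject + ", " + predicate + ", " + object + ")";
        }
    }

    private final List<Triple> triples = new ArrayList<>();

    void add(String subject, String predicate, String object) {
        triples.add(new Triple(subject, predicate, object));
    }

    // Match a triple pattern; a null argument acts as a wildcard (a "variable").
    List<Triple> match(String subject, String predicate, String object) {
        List<Triple> hits = new ArrayList<>();
        for (Triple t : triples) {
            if ((subject == null || subject.equals(t.subject))
                    && (predicate == null || predicate.equals(t.predicate))
                    && (object == null || object.equals(t.object))) {
                hits.add(t);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        TripleSketch store = new TripleSketch();
        store.add("akmal", "likes", "satay");
        store.add("akmal", "likes", "kebabs");
        store.add("akmal", "age", "40");
        // Roughly: SELECT ?food WHERE { akmal likes ?food }
        System.out.println(store.match("akmal", "likes", null));
    }
}
```

New predicates can be added at any time without altering a schema, which is what "highly flexible" means in the table above.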


What about Oracle?
The meme changed (again)

Not
Only SQL No, SQL
The rise of SQL ...
First they ignore you, then they laugh at
you, then they fight you, then you win.
-- Mahatma Gandhi (disputed)

Source: http://en.wikiquote.org/wiki/Mahatma_Gandhi
The rise of SQL
Name Example

AQL FOR ... IN ... FILTER ... RETURN

CQL SELECT ... FROM ... WHERE ...

SQL for Documents SELECT ... FROM ... WHERE ...

db.collection.find( { ... } )
Summary
“The Time Tunnel”

Source: Shutterstock Image ID 135864122


Source: ParElastic, used with permission
History repeats
Those who cannot remember the past are
condemned to repeat it.
-- George Santayana

Source: “Reason in Common Sense” of “The Life of Reason” George Santayana (1905)
Relational does NoSQL
Often the overhead of managing data in
multiple databases is more than the
advantages of the other store being faster.
You can do “NoSQL” inside and around a
hackable database like PostgreSQL, not
just as a separate one.
-- Hannu Krosing

Source: https://2013.nosql-matters.org/cgn/wp-content/uploads/2013/02/PostSQL_at_noSQLmatters-1slide.pdf
“MySQL is web scale”

• Collaboration between Alibaba, Facebook,
Google, LinkedIn and Twitter
• Adding more features to MySQL, specific to
deployments in large-scale environments
Structured vs. unstructured

Structured Unstructured
Relational vs. NoSQL toolbox
Relational vs. NoSQL ...
It is specious to compare NoSQL
databases to relational databases; as
you’ll see, none of the so-called “NoSQL”
databases have the same implementation,
goals, features, advantages, and
disadvantages. So comparing “NoSQL” to
“relational” is really a shell game.
-- Eben Hewitt

Source: “Cassandra: The Definitive Guide” Eben Hewitt (2010)


Relational vs. NoSQL

Source: Getty Image ID WCO_016


Choices, choices
Navigating the DB universe
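[Figure: database categories positioned by application complexity (simple to complex), data velocity (slow to fast) and data size (small to large), along a spectrum from transactional to analytic workloads: interactive, real-time analytics, record lookup, historical analytics, exploratory analytics. Categories shown: Traditional RDBMS, NewSQL, NoSQL, Data Warehouse, Hadoop, etc. Data value is shown as the value of an individual data item versus aggregate data value.]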

Source: VoltDB, used with permission


Understand your use case

Source: http://www.techvalidate.com/tvid/F66-11B-178/
Understand vendor-speak
What vendor says               What vendor means
The biggest in the world       The biggest one we’ve got
The biggest in the universe    The biggest one we’ve got
There is no limit to ...       It’s untested, but we don’t mind if you try it
A new and unique feature       Something the competition has had for ages
Currently available feature    We are about to start Beta testing
Planned feature                Something the competition has, that we wish we had too, that we might have one day
Highly distributed             International offices
Engineered for robustness      Comes in a tough box

Source: “Object Databases: An Evaluation and Comparison” Bloor Research (1994)


Vendor marketing example
Really, really effective marketing masks
MongoDB’s shortcomings...
-- Robert Roland

Source: http://www.slideshare.net/cloudera/case-studies-session-5b/
“Foundation”
... there is a branch of human knowledge
known as symbolic logic ... When Holk,
after two days of steady work, succeeded
in eliminating meaningless statements,
vague gibberish, useless qualifications --
in short, all the goo and dribble -- he found
he had nothing left. Everything canceled
out.
-- Isaac Asimov
Source: “Foundation” Isaac Asimov (1951)
Understand the risks
The great debate ...

Source: Getty Image ID WCO_011


The great debate ...
About every ten years or so, there is a
“great debate” between, on the one hand,
those who see the problem of data
modelling through a more or less relational
lens, and on the other, a noisier set of
“refuseniks” who have a hot new thing to
promote. The debate usually goes like
this:
The great debate ...
Refuseniks: Hah! You relational people
with your flat tables and silly query
languages! You are so unhip! You simply
cannot deal with the problem of [INSERT
NEW THING HERE]. With an [INSERT
NEW THING HERE]-DBMS we will finish
you, and grind your bones into dust!
The great debate
R-people: You make some good points.
But unfortunately a) there is an enormous
amount of money invested in building
scalable, efficient and reliable database
management products and no one is going
to drop all of that on the floor and b) you
are confusing DBMS engineering
decisions with theoretical questions. We
plan to incorporate the best of these ideas
into our products.
Source: Paul Brown
The problem is not the tool itself

Source: CommitStrip, used with permission


It’s the people ...
... MongoDB Day London ... the problem is
the people! They all talk like this:
1. Some problem that just doesn’t really
exist (or hasn’t existed for a very long
time) with relational databases
2. MongoDB
3. Profit!
-- Gaius Hammond

Source: http://gaiustech.wordpress.com/2013/04/13/mongodb-days/
It’s the people
... most of the business people driving the
Big Data NoSQL databases are data
management illiterate; don’t recognize the
lack of NoSQL data management
facilities ... and don’t know anything about
availability, referential integrity and
normalized data designs.
-- Dave Beulke

Source: http://davebeulke.com/big-data-day-recap/
Don’t be a Lemming

Source: Shutterstock Image ID 34566709


Limitations of NoSQL
• Lack of standardized or well-defined semantics
– Transactions? Isolation levels?
• Reduced consistency for performance and
scalability
– “Eventual consistency”
– “Soft commit”
• Limited forms of access, e.g. often no joins, etc.
• Proprietary interfaces
• Large clusters, failover, etc.?
• Security?
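A minimal sketch of why "eventual consistency" can surprise a reader, assuming a toy quorum model that these slides do not specify: N replicas, a write acknowledged after reaching W of them, and a read that asks R of them and keeps the highest version seen. The names `Versioned` and `stale_read_possible` are hypothetical and do not describe any particular product's protocol.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Versioned:
    """A stored value tagged with a monotonically increasing version."""
    version: int
    value: str


def stale_read_possible(n: int, w: int, r: int) -> bool:
    """Return True if some set of r replicas can miss the latest write.

    Toy model (an assumption, not any product's protocol): n replicas,
    a write is acknowledged once w replicas hold it, and a read asks
    r replicas and keeps the highest version it sees.
    """
    old, new = Versioned(1, "old"), Versioned(2, "new")
    # The write has reached only the first w replicas; the rest lag behind.
    replicas = [new if i < w else old for i in range(n)]
    # Try every possible choice of r replicas the reader might contact.
    for read_set in combinations(range(n), r):
        newest = max((replicas[i] for i in read_set), key=lambda v: v.version)
        if newest.version < new.version:
            return True  # this reader would see only the old value
    return False


if __name__ == "__main__":
    for n, w, r in [(3, 1, 1), (3, 2, 1), (3, 2, 2), (3, 3, 1)]:
        print(f"N={n} W={w} R={r} stale read possible: {stale_read_possible(n, w, r)}")
```

In this model a stale read is possible exactly when R + W <= N, which is the trade the "reduced consistency for performance and scalability" bullet points at: smaller W and R mean fewer replicas to wait for, at the cost of reads that may lag behind writes.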
Future directions
• Internal polyglot support
• Multi-model systems
• Google F1-inspired systems
– “Can you have a scalable database without going
NoSQL? Yes.”
• Further support for NoSQL in Relational
• DBaaS
• Orchestrate.io
– “The Next Big Thing”?
Final thoughts
We are clearly in the phase of a new
technology adoption in which the category
is hyped, its benefits over-promised, its
limitations poorly understood, and its value
oversold.
-- Tim Berglund

Source: “Saying Yes to NoSQL” Tim Berglund (2011)


There will be harmony

Source: Shutterstock Image ID 73418620


Contact details
Find me on
– http://www.linkedin.com/in/akmalchaudhri/
– http://twitter.com/akmalchaudhri/
– http://www.quora.com/Akmal-Chaudhri/
– http://www.facebook.com/akmal.chaudhri/
– http://plus.google.com/+AkmalChaudhri/
– http://www.slideshare.net/VeryFatBoy/
– http://www.youtube.com/VeryFatBoyVideos/
Akmal B. Chaudhri
[email protected]
{"thank":"You"}
Resources
History ...
• First NoSQL meetup
– http://nosql.eventbrite.com/
– http://blog.oskarsson.nu/post/22996139456/nosql-meetup
• First NoSQL meetup debrief
– http://blog.oskarsson.nu/post/22996140866/nosql-debrief
• First NoSQL meetup photographs
– http://www.flickr.com/photos/russss/sets/72157619711038897/
History
• Codd’s Relational Vision -- Has NoSQL Come Full Circle?
– http://www.opensourceconnections.com/2013/12/11/codds-relational-vision-has-nosql-come-full-circle/
NoSQL Search roadshow
• Multi-city tour 2013
– Munich
– Berlin
– San Francisco
– Copenhagen
– Zurich
– Amsterdam
– London
Web sites
• NoSQL Databases and Polyglot Persistence: A Curated Guide
– http://nosql.mypopescu.com/
• NoSQL: Your Ultimate Guide to the Non-Relational Universe!
– http://nosql-database.org/
Free books ...

• Data Access for Highly-Scalable Solutions: Using SQL, NoSQL, and Polyglot Persistence
– http://www.microsoft.com/en-us/download/details.aspx?id=40327
• Getting Started with Oracle NoSQL Database
– http://books.mcgraw-hill.com/ebookdownloads/NoSQL/
Free books ...
• Enterprise NoSQL for Dummies
– http://www.nosqlfordummies.com/
• Graph Databases
– http://www.graphdatabases.com/
Free books ...
• The Little MongoDB Book
– http://openmymind.net/mongodb.pdf
• The Little Redis Book
– http://openmymind.net/redis.pdf
Free books ...
• CouchDB: The Definitive Guide
– http://guide.couchdb.org/
• A Little Riak Book
– https://github.com/coderoshi/little_riak_book/
Free books ...
• Understanding The Top 5 Redis Performance Metrics
– https://www.datadoghq.com/wp-content/uploads/2013/09/Understanding-the-Top-5-Redis-Performance-Metrics.pdf
• DBA’s Guide to NoSQL
– https://www.smashwords.com/books/view/479798/
Free books
• Mastering Hazelcast
– http://hazelcast.com/resources/mastering-hazelcast/
• Fast Data and the New Enterprise Data Architecture
– http://voltdb.com/fast-data-and-new-enterprise-data-architecture/
Free training ...
[Images: two certificates dated Dec. 24th, 2012, certifying that Akmal Chaudhri successfully completed M101: MongoDB for Developers and M102: MongoDB for DBAs, courses of study offered by 10gen, The MongoDB Company. Signed by Dwight Merriman and Andrew Erlichson (Vice President, Education), 10gen, Inc.]
Authenticity of these certificates can be verified at:
– https://education.10gen.com/downloads/certificates/1e73378509f046f28cbcb2212f3d7cff/Certificate.pdf
– https://education.10gen.com/downloads/certificates/c0e418e393e247eb818d82d0472549f4/Certificate.pdf

• MongoDB
– https://education.mongodb.com/
Free training
• Aerospike
– http://www.aerospike.com/training/<administration | development>/online/
• Cassandra
– https://academy.datastax.com/
• Neo4j
– http://www.neo4j.org/learn/online_course/
• OrientDB
– http://www.orientechnologies.com/getting-started/
Articles ...
• Saying Yes to NoSQL
– http://www.nofluffjuststuff.com/s/magazine/NFJS_theMagazine_Vol3_Issue3_May2011.pdf
• The State of NoSQL
– http://www.infoq.com/articles/State-of-NoSQL/
• An Introduction to NoSQL Patterns
– http://architects.dzone.com/articles/introduction-nosql-patterns/
Articles
• Why is the NoSQL choice so difficult?
– http://www.itworld.com/big-data/428051/why-nosql-choice-so-difficult/
• NoSQL is a no go once again
– http://www.itworld.com/big-data/428717/nosql-no-go-once-again/
• The NoSQL Advice I Wish Someone Had Given Me
– http://sql.dzone.com/articles/nosql-advice-i-wish-someone/
Free reports ...
• A deep dive into NoSQL: A complete list of NoSQL databases
– http://www.bigdata-madesimple.com/a-deep-dive-into-nosql-a-complete-list-of-nosql-databases/
• Deconstructing NoSQL
– http://whitepapers.dataversity.net/content37165/
• The DZone Guide to Database and Persistence Management
– http://www.dzone.com/research/guide-to-databases/
Free reports ...
• 2013 Gartner Magic Quadrant for Operational Database Management Systems
– http://www.aerospike.com/only-visionary-in-gartner-mq-2013/
• 2014 Gartner Magic Quadrant for Operational Database Management Systems
– http://www.datastax.com/gartner-magic-quadrant-odbms
Free reports ...
• Gartner: Five Data Persistence Dilemmas That Will Keep CIOs Up at Night
– http://www1.memsql.com/gartner-cio-report/
Free reports ...
• The Forrester Wave™: NoSQL Key-Value Databases, Q3 2014
– https://www.mapr.com/forrester-wave-hadoop-nosql-key-value-databases
• The Forrester Wave™: NoSQL Document Databases, Q3 2014
– http://info.marklogic.com/forrester-wave.html
• Forrester Ranks the NoSQL Database Vendors
– http://www.datanami.com/2014/10/03/forrester-ranks-nosql-database-vendors/
Free reports
• The Real World of The Database Administrator
– https://software.dell.com/whitepaper/the-real-world-of-the-database-administrator-875469/
White papers
• The CIO’s Guide to NoSQL
– http://documents.dataversity.net/whitepapers/the-cios-guide-to-nosql.html
Vendor funding ...
• Visualizing the $1bn+ VC investment in Hadoop and NoSQL
– http://blogs.the451group.com/information_management/2013/12/17/visualizing-the-1bn-vc-investment-in-hadoop-and-nosql/
• Hadoop vs. NoSQL -- Which Big Data Technology Has Raised More Funding?
– http://www.cbinsights.com/blog/hadoop-nosql-venture-capital-funding/
Vendor funding
• The NoSQLNow conference in San Jose this week
– http://swtrends.wordpress.com/2014/08/22/the-nosqlnow-conference-in-san-jose-this-week/
• NoSQL market frames larger debate: Can open source be profitable?
– http://siliconangle.com/blog/2015/03/19/nosql-market-frames-larger-debate-can-open-source-be-profitable/
Brewer’s CAP “Theorem” ...
• Towards Robust Distributed Systems
– http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
• Deconstructing the ‘CAP theorem’ for CM and DevOps
– http://markburgess.org/blog_cap.html
• NoCAP Or, Achieving Scalability Without Compromising on Consistency
– http://www.gigaspaces.com/system/files/private/resource/NoCAPfinal0711.pdf
Brewer’s CAP “Theorem” ...
• Brewer’s CAP Theorem
– http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
• Confused CAP Arguments
– http://www.stucharlton.com/blog/archives/2010/10/confused-cap-arguments.html
• Please stop calling databases CP or AP
– https://martin.kleppmann.com/2015/05/11/please-stop-calling-databases-cp-or-ap.html
Brewer’s CAP “Theorem”
• The CAP theorem series
– http://blog.thislongrun.com/2015/03/the-cap-theorem-series.html
Data consistency
• Replicated Data Consistency Explained Through Baseball
– http://research.microsoft.com/apps/pubs/default.aspx?id=206913
• Distributed Algorithms in NoSQL Databases
– https://highlyscalable.wordpress.com/2012/09/18/distributed-algorithms-in-nosql-databases/
Product selection ...
• 101 Questions to Ask When Considering a NoSQL Database
– http://highscalability.com/blog/2011/6/15/101-questions-to-ask-when-considering-a-nosql-database.html
• 35+ Use Cases for Choosing Your Next NoSQL Database
– http://highscalability.com/blog/2011/6/20/35-use-cases-for-choosing-your-next-nosql-database.html
Product selection ...
• NoSQL Data Modeling Techniques
– http://highlyscalable.wordpress.com/2012/03/01/nosql-data-modeling-techniques/
• Choosing a NoSQL data store according to your data set
– http://00f.net/2010/05/15/choosing-a-nosql-data-store-according-to-your-data-set/
• The Right Database for Your Use Case
– http://mpron.github.io/the-right-database-for-your-use-case/
Product selection ...
• NoSQL Options Compared: Different Horses for Different Courses
– http://www.slideshare.net/tazija/nosql-options-compared/
• The NoSQL Technical Comparison Report: Cassandra (DataStax), MongoDB, and Couchbase Server
– http://www.altoros.com/nosql-tech-comparison-cassandra-mongodb-couchbase.html
Product selection ...
• The Solutions Architect’s Guide to Choosing a (NoSQL) Data Store
– http://bogdanbocse.com/2014/12/the-solutions-architects-guide-to-choosing-a-nosql-data-store-process-overview/
– http://bogdanbocse.com/2014/12/the-solutions-architects-guide-to-choosing-a-nosql-data-store-analyze-the-requirements-of-your-ideal-solutions/
Product selection
• Design Assistant for NoSQL Technology Selection
– http://dl.acm.org/citation.cfm?id=2751494
Short product overviews
• Cassandra vs MongoDB vs CouchDB vs Redis vs Riak vs HBase vs Couchbase vs Neo4j vs Hypertable vs ElasticSearch vs Accumulo vs VoltDB vs Scalaris comparison
– http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis/
• vsChart.com
– http://vschart.com/list/database/
Case studies ...
• Real World NoSQL: HBase at Trend Micro
– http://gigaom.com/cloud/real-world-nosql-hbase-at-trend-micro/
• Real World NoSQL: MongoDB at Shutterfly
– http://gigaom.com/cloud/real-world-nosql-mongodb-at-shutterfly/
• Real World NoSQL: Cassandra at Openwave
– http://gigaom.com/cloud/realworld-nosql-cassandra-at-openwave/
Case studies ...
• Real World NoSQL: Amazon SimpleDB at Netflix
– http://gigaom.com/cloud/real-world-nosql-amazon-simpledb-at-netflix/
• Real World NoSQL: Membase at Tribal Crossing
– http://gigaom.com/cloud/real-world-nosql-membase-at-tribal-crossing/
• How Disney built a big data platform on a startup budget
– http://gigaom.com/data/how-disney-built-a-big-data-platform-on-a-startup-budget/
Case studies ...
• Choosing a NoSQL: A Real-Life Case
– http://www.slideshare.net/VolhaBanadyseva/10-ss-choosing-a-nosql-database/
• From 1000/day to 1000/sec: The Evolution of Incapsula’s BIG DATA System
– http://www.slideshare.net/Incapsula/surge2014/
• Providence: Failure Is Always an Option
– http://jasonpunyon.com/blog/2015/02/12/providence-failure-is-always-an-option/
Case studies
• NoSQL Data Store Technologies
– http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA611676
NoSQL alternatives
• Etsy goes retro to scale big data
– http://www.techrepublic.com/article/etsy-goes-retro-to-scale/
• Project Mezzanine: The Great Migration
– https://eng.uber.com/mezzanine-migration/
High-profile MySQL web sites
• Facebook
– http://www.mysql.com/customers/view/?id=757
• Twitter
– http://www.mysql.com/customers/view/?id=951
• Tumblr
– http://www.mysql.com/customers/view/?id=1186
• Wikipedia
– http://www.mysql.com/customers/view/?id=663
Negative NoSQL comments ...
• MongoDB is to NoSQL like MySQL to SQL -- in the most harmful way
– http://use-the-index-luke.com/blog/2013-10/mysql-is-to-sql-like-mongodb-to-nosql
• The Genius and Folly of MongoDB
– http://nyeggen.com/blog/2013/10/18/the-genius-and-folly-of-mongodb/
• Why You Should Never Use MongoDB
– http://www.sarahmei.com/blog/2013/11/11/why-you-should-never-use-mongodb/
Negative NoSQL comments ...
• Failing with MongoDB
– http://blog.schmichael.com/2011/11/05/failing-with-mongodb/
– https://speakerdeck.com/robotadam/postgres-at-urban-airship/
• A Year with MongoDB
– http://blog.engineering.kiip.me/post/20988881092/a-year-with-mongodb/
– https://speakerdeck.com/mitsuhiko/a-year-of-mongodb/
Negative NoSQL comments ...
• Why MongoDB Never Worked Out at Etsy
– http://mcfunley.com/why-mongodb-never-worked-out-at-etsy/
• A post you wish to read before considering using MongoDB for your next app
– http://longtermlaziness.wordpress.com/2012/08/24/a-post-you-wish-to-read-before-considering-using-mongodb-for-your-next-app/
Negative NoSQL comments ...
• Goodbye, CouchDB
– http://sauceio.com/index.php/2012/05/goodbye-couchdb/
• Don’t use NoSQL
– https://speakerdeck.com/roidrage/dont-use-nosql/
– http://vimeo.com/49713827/
• The SQL and NoSQL Effects: Will They Ever Learn?
– http://www.dbdebunk.com/2015/07/the-sql-and-nosql-effects-will-they.html
Negative NoSQL comments ...
• Do Developers Use NoSQL Because They’re Too Lazy to Use RDBMS Correctly?
– http://architects.dzone.com/articles/do-developers-use-nosql/
– http://gaiustech.wordpress.com/2013/04/13/mongodb-days/
• The parallels between NoSQL and self-inflicted torture
– http://www.parelastic.com/blog/parallels-between-nosql-and-self-inflicted-torture/
Negative NoSQL comments
• 7 hard truths about the NoSQL revolution
– http://www.infoworld.com/article/2617405/nosql/7-hard-truths-about-the-nosql-revolution.html
• Google goes back to the future with SQL F1 database
– http://www.theregister.co.uk/2013/08/30/google_f1_deepdive/
• What’s left of NoSQL?
– http://use-the-index-luke.com/blog/2013-04/whats-left-of-nosql
Gotchas ...
• Broken by Design: MongoDB Fault Tolerance
– http://hackingdistributed.com/2013/01/29/mongo-ft/
• Things they don’t tell you about MongoDB
– http://www.itexto.com.br/devkico/en/?p=44
• MongoDB Gotchas & How To Avoid Them
– http://rsmith.co/2012/11/05/mongodb-gotchas-and-how-to-avoid-them/
Gotchas
• Top 5 syntactic weirdnesses to be aware of in MongoDB
– http://devblog.me/wtf-mongo
• This Team Used Apache Cassandra... You Won’t Believe What Happened Next
– http://blog.parsely.com/post/1928/cass/
NoSQL to Relational ...
• MongoDB to MySQL (Aadhar)
– http://techcrunch.com/2013/12/06/inside-indias-aadhar-the-worlds-biggest-biometrics-database/
• MongoDB to MySQL (Diaspora)
– http://www.slideshare.net/sarahmei/taking-diaspora-from-mongodb-to-mysql-rubyconf-2011/
• Redis to MySQL (OpenSource Connections)
– http://www.slideshare.net/AllThingsOpen/stop-worrying-love-the-sql-a-case-study/
NoSQL to Relational ...
• MongoDB to PostgreSQL (Urban Airship)
– http://blog.schmichael.com/2011/11/05/failing-with-mongodb/
• MongoDB to Postgres
– http://blog.testdouble.com/posts/2014-06-23-mongo-to-postgres.html
• MongoDB to PostgreSQL (Errbit fork)
– https://github.com/errbit/errbit/issues/614/
NoSQL to Relational ...
• MongoDB to PostgreSQL (Olery)
– http://developer.olery.com/blog/goodbye-mongodb-hello-postgresql/
• NoSQL to PostgreSQL (Revolv)
– http://technosophos.com/2014/04/11/nosql-no-more.html
• MongoDB to NuoDB (DropShip Commerce)
– http://searchdatamanagement.techtarget.com/feature/NewSQL-database-sends-NoSQL-technology-packing-at-logistics-exchange
NoSQL to Relational
• RavenDB to SQL Server (Octopus)
– https://octopusdeploy.com/blog/3.0-switching-to-sql/
NoSQL to NoSQL ...
• MongoDB. This is not the database you are looking for.
– http://patrickmcfadin.com/2014/02/11/mongodb-this-is-not-the-database-you-are-looking-for/
• MongoDB to Couchbase (Viber)
– http://www.slideshare.net/Couchbase/couchbasetlv2014couchbaseatviber/
• MongoDB to HBase (Simply Measured)
– http://www.slideshare.net/RobertRoland2/rebuilding-22995359/
NoSQL to NoSQL ...
• MongoDB to Cassandra (MetaBroadcast)
– http://www.slideshare.net/fredvdd/mongodb-to-cassandra/
• MongoDB to Cassandra (SHIFT)
– http://www.slideshare.net/DataStax/shift-real-world-migration-from-mongo-db-to-cassandra-25970769/
• MongoDB to Cassandra (FullContact)
– http://www.fullcontact.com/blog/mongo-to-cassandra-migration/
NoSQL to NoSQL ...
• MongoDB to Cassandra (Shodan)
– http://planetcassandra.org/blog/post/mongodb-to-cassandra-a-developers-story/
• MongoDB to Cassandra (Retailigence)
– http://planetcassandra.org/blog/post/retailigence-turns-to-apache-cassandra-after-returning-mysql-and-mongodb-for-scalable-location-based-shopping-api/
• MongoDB to Neo4j (Shindig)
– http://seenickcode.com/switching-from-mongodb-to-neo4j/
NoSQL to NoSQL ...
• MongoDB to Cloudant (Postmark)
– http://blog.postmarkapp.com/post/37338222496/bye-mongodb-hello-cloudant/
• MongoDB to DynamoDB (Gummicube)
– https://www.codementor.io/devops/tutorial/handling-date-and-datetime-in-dynamodb/
• Cassandra to DynamoDB (Tellybug)
– http://attentionshard.wordpress.com/2013/09/30/why-tellybug-moved-from-cassandra-to-amazon-dynamodb/
NoSQL to NoSQL
• Redis to Cassandra (Instagram)
– http://planetcassandra.org/blog/post/cassandra-summit-2013-instagrams-shift-to-cassandra-from-redis-by-rick-branson/
Security ...
• Abusing NoSQL Databases
– https://www.defcon.org/images/defcon-21/dc-21-presentations/Chow/DEFCON-21-Chow-Abusing-NoSQL-Databases.pdf
• NoSQL, no security?
– http://www.slideshare.net/wurbanski/nosql-no-security/
• NoSQL, No Injection!?
– http://www.slideshare.net/wayne_armorize/nosql-no-sql-injections-4880169/
Security ...
• NoSQL, But Even Less Security
– http://blogs.adobe.com/asset/files/2011/04/NoSQL-But-Even-Less-Security.pdf
• NoSQL Database Security
– http://pastconferences.auscert.org.au/conf2011/presentations/Louis%20Nyffenegger%20V1.pdf
• Does NoSQL Mean No Security?
– http://www.darkreading.com/application-security/database-security/does-nosql-mean-no-security/d/d-id/1136913
Security ...
• A Response To NoSQL Security Concerns
– http://www.darkreading.com/application-security/database-security/a-response-to-nosql-security-concerns/d/d-id/1137044
• Mongodb -- Security Weaknesses in a typical NoSQL database
– http://blog.spiderlabs.com/2013/03/mongodb-security-weaknesses-in-a-typical-nosql-database.html
• Neo4j -- “Enter the GraphDB”
– http://blog.scrt.ch/2014/05/09/neo4j-enter-the-graphdb/
Security
• More Data, More Problems: Part #1
– http://blog.imperva.com/2014/08/more-data-more-problems-part-1.html
• More Data, More Problems: Part #2
– http://blog.imperva.com/2014/08/more-data-more-problems-part-2.html
• More Data, More Problems: Part #3
– http://blog.imperva.com/2014/09/more-data-more-problems-part-3.html
NoSQL injection testing ...
• NoSQLMap project
– http://nosqlmap.net
– https://github.com/tcstool/NoSQLMap/
• Making Mongo Cry: NoSQL for Penetration Testers
– http://www.nosqlmap.net/DC22-WoS-Nosql_slides.pptx
NoSQL injection testing ...
• NoSQL Exploitation Framework
– http://nosqlproject.com
• Pentesting NoSQL DB’s with NoSQL Exploitation Framework
– https://www.hackinparis.com/node/267/
– http://www.slideshare.net/44Con/pentesting-nosql-dbs-with-nosql-exploitation-framework/
NoSQL injection testing ...
• Does NoSQL Equal No Injection?
– http://securityintelligence.com/does-nosql-equal-no-injection
• No SQL, No Injection? Examining NoSQL Security
– http://arxiv.org/pdf/1506.04082v1
NoSQL injection testing ...
• Hacking NodeJS and MongoDB
– http://blog.websecurify.com/2014/08/hacking-nodejs-and-mongodb.html
– http://java.dzone.com/articles/defending-against-query/
• NoSQL SSJI Authentication Bypass
– http://blog.imperva.com/2014/10/nosql-ssji-authentication-bypass.html
NoSQL injection testing
• Attacking MongoDB
– http://www.slideshare.net/cyber-punk/mongo-db-eng/
• Avoiding MongoDB hash-injection attacks
– http://cirw.in/blog/hash-injection
– https://github.com/eoftedal/HashInjection/
• No SQL injection but NoSQL Injection
– http://www.slideshare.net/sth4ck/sthack-2013-florian-agixid-gaultier-no-sql-injection-but-no-sql-injection/
NoSQL forensics
• NoSQL Forensics: What to do with (No)ARTIFACTS
– https://speakerdeck.com/505forensics/nosql-forensics-what-to-do-with-no-artifacts/
• NoSQL Injections: Moving Beyond or ‘1’=‘1’
– https://speakerdeck.com/505forensics/nosql-injections-moving-beyond-or-1-equals-1/
• NoSQL Triage Scripts
– https://github.com/505Forensics/nosql_triage/
NoSQL honeypot testing
• NoSQL Honeypot Framework (NoPo)
– https://github.com/torque59/nosqlpot/
Polyglot persistence ...
• NoSQL Database Choices: Weather Co. CIO’s Advice
– http://www.informationweek.com/big-data/software-platforms/nosql-database-choices-weather-co-cios-advice/a/d-id/1317052
• Why we started using PostgreSQL with Slick next to MongoDB
– http://www.plotprojects.com/why-we-use-postgresql-and-slick/
Polyglot persistence ...
• HBase at Mendeley
– http://www.slideshare.net/danharvey/hbase-at-mendeley/
• Polyglot Persistence
– http://www.slideshare.net/jwoodslideshare/polyglot-persistence-two-great-tastes-that-taste-great-together-4625004/
• Polyglot Persistence Patterns
– http://abhishek-tiwari.com/post/polyglot-persistence-patterns/
Polyglot persistence
• Polyglot Persistence: EclipseLink with MongoDB and Derby
– http://java.dzone.com/articles/polyglot-persistence-0/
• D. Ghosh (2010) Multiparadigm data storage for enterprise applications. IEEE Software. Vol. 27, No. 5, pp. 57-60
Performance benchmarks ...
• Yahoo Cloud Serving Benchmark
– http://research.yahoo.com/node/3202/
– http://altoros.com/nosql-research
– http://www.slideshare.net/tazija/evaluating-nosql-performance-time-for-benchmarking/
– http://jaxenter.com/evaluating-nosql-performance-which-database-is-right-for-your-data.1-49428.html
Performance benchmarks ...
• 2015 YCSB results
– http://info.couchbase.com/Benchmark_MongoDB_VS_CouchbaseServer_B.html
– http://www.mongodb.com/lp/white-paper/benchmark-report/
– http://www.datastax.com/apache-cassandra-leads-nosql-benchmark
Performance benchmarks ...
• Rising NoSQL Star: Aerospike, Cassandra, Couchbase or Redis?
– https://redislabs.com/blog/nosql-performance-aerospike-cassandra-datastax-couchbase-redis
• Performance comparison between ArangoDB, MongoDB, Neo4j and OrientDB
– https://www.arangodb.com/nosql-performance-blog-series/
– https://github.com/weinberger/nosql-tests/
Performance benchmarks ...
• Performance Evaluation of NoSQL Databases: A Case Study
– http://www.researchgate.net/publication/275033854_Performance_Evaluation_of_NoSQL_Databases_A_Case_Study
• A Case Study for NoSQL Applications and Performance Benefits: CouchDB vs. Postgres
– http://figshare.com/articles/A_Case_Study_for_NoSQL_Applications_and_Performance_Benefits_CouchDB_vs_Postgres/787733
Performance benchmarks ...
• Ultra-High Performance NoSQL Benchmarking
– http://thumbtack.net/whitepapers/ultra-high-performance-nosql-benchmark.html
• Benchmarking Top NoSQL Databases
– http://www.datastax.com/resources/whitepapers/benchmarking-top-nosql-databases
• Comparing NoSQL Data Stores
– http://www.quantschool.com/home/programming-2/comparing_inmemory_data_stores/
Performance benchmarks ...
• NoSQL Performance when Scaling by RAM
– http://info.couchbase.com/rs/northscale/images/NoSQL_Performance_Scaling_by_RAM.pdf
• Dissecting the NoSQL Benchmark
– http://blog.couchbase.com/dissecting-nosql-benchmark/
• No SQL Performance Benchmark by SandStorm
– http://www.sandstormsolution.com/nosql.html
Performance benchmarks ...
• Benchmarking Couchbase Server
– http://www.slideshare.net/Couchbase/t1-s4-couchbase-performancebenchmarkingv34/
• NoSQL Performance Benchmarks Series: Couchbase
– http://blog.bigstep.com/big-data-performance/nosql-performance-benchmarks-series-couchbase/
• Benchmarking Riak
– https://medium.com/@mustwin/benchmarking-riak-bfee93493419/
Performance benchmarks ...
• NoSQL Fast? Not always. A benchmark
– http://machielgroeneveld.wordpress.com/2014/07/01/nosql-fast/
• Finding the right NoSQL data store: Results for my use case and a surprise
– https://www.paluch.biz/blog/124-finding-the-right-nosql-data-store-results-for-my-use-case-and-a-surprise.html
Performance benchmarks ...
• MongoDB Performance Pitfalls -- Behind The Scenes
– http://blog.trackerbird.com/content/mongodb-performance-pitfalls-behind-the-scenes/
• MySQL vs. MongoDB Disk Space Usage
– http://blog.trackerbird.com/content/mysql-vs-mongodb-disk-space-usage/
• MongoDB: Scaling write performance
– http://www.slideshare.net/daumdna/mongodb-scaling-write-performance/
Performance benchmarks ...
• MySql vs MongoDB performance benchmark
– http://www.moredevs.ro/mysql-vs-mongodb-performance-benchmark/
• Postgres Outperforms MongoDB and Ushers in New Developer Reality
– http://blogs.enterprisedb.com/2014/09/24/postgres-outperforms-mongodb-and-ushers-in-new-developer-reality/
Performance benchmarks ...
• Can the Elephants Handle the NoSQL Onslaught?
– http://vldb.org/pvldb/vol5/p1712_avriliafloratou_vldb2012.pdf
• Solving Big Data Challenges for Enterprise Application Performance Management
– http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf
• NoSQL RDF
– https://github.com/ahaque/hive-hbase-rdf/
Performance benchmarks
• Benchmarking Graph Databases
– http://istc-bigdata.org/index.php/benchmarking-graph-databases/
• Benchmarking Graph Databases -- Updates
– http://istc-bigdata.org/index.php/benchmarking-graph-databases-updates/
• Linked Data Benchmark Council
– http://ldbc.eu/
Benchmarking tips ...
• How not to benchmark Cassandra
– http://www.datastax.com/dev/blog/how-not-to-
benchmark-cassandra
• How not to benchmark Cassandra: a case study
– http://www.datastax.com/dev/blog/how-not-to-
benchmark-cassandra-a-case-study
• Scaling NoSQL databases: 5 tips for increasing
performance
– http://radar.oreilly.com/2014/09/scaling-nosql-
databases-5-tips-for-increasing-performance.html
Benchmarking tips ...
• NoSQL Database Architecture and Performance: How to Evaluate and Benchmark
– http://www.slideshare.net/altoros/nosql-cassandra-mongodb-couchbase-comparison/
• How To Benchmark NoSQL Databases
– http://blog.bigstep.com/big-data-performance/benchmark-nosql-databases/
Benchmarking tips
• Correcting YCSB’s Coordinated Omission problem
– http://psy-lob-saw.blogspot.co.uk/2015/03/fixing-ycsb-coordinated-omission.html
Stress testing ...
• Jepsen
– http://www.aphyr.com/tags/jepsen
• Jepsen: Testing the Partition Tolerance of PostgreSQL, Redis, MongoDB and Riak
– http://www.infoq.com/articles/jepsen/
• The Man Who Tortures Databases
– http://www.informationweek.com/software/information-management/the-man-who-tortures-databases/240160850/
Stress testing ...
• Testing Network failure using NuoDB and Jepsen, part 1
– http://dev.nuodb.com/techblog/testing-network-failure-using-nuodb-and-jepsen-part-1
• Testing Network failure using NuoDB and Jepsen, part 2
– http://dev.nuodb.com/techblog/testing-network-failure-using-nuodb-and-jepsen-part-2
Stress testing
• Call Me Maybe: FoundationDB vs. Jepsen
– http://blog.foundationdb.com/call-me-maybe-foundationdb-vs-jepsen
• Jepsen IV: Hope Springs Eternal
– http://www.thedotpost.com/2015/06/kyle-kingsbury-jepsen-iv-hope-springs-eternal
Unit testing
• Unit Testing NoSQL Databases Applications with NoSQLUnit
– http://www.methodsandtools.com/tools/nosqlunit.php
– https://github.com/lordofthejars/nosql-unit/
BI/Analytics
• BI/Analytics on NoSQL: Review of Architectures Part 1
– http://www.dataversity.net/bianalytics-on-nosql-review-of-architectures-part-1/
• BI/Analytics on NoSQL: Review of Architectures Part 2
– http://www.dataversity.net/bianalytics-on-nosql-review-of-architectures-part-2/
Various graphics ...
• Data Platforms Landscape map
– https://451research.com/state-of-the-database-landscape/
• NoSQL LinkedIn Skills Index -- June 2015
– https://blogs.the451group.com/information_management/2015/07/07/nosql-linkedin-skills-index-june-2015/
Various graphics ...
• Necessity is the mother of NoSQL
– http://blogs.the451group.com/information_management/2011/04/20/necessity-is-the-mother-of-nosql/
• Making Sense of Big Data
– http://www.slideshare.net/infochimps/making-sense-of-big-data/
• NoSQL, Heroku, and You
– https://blog.heroku.com/archives/2010/7/20/nosql/
Various graphics
• The NoSQL vs. SQL hoopla, another turn of the screw!
– http://www.parelastic.com/blog/nosql-vs-sql-hoopla-another-turn-screw/
• Navigating the Database Universe
– http://www.slideshare.net/lisapaglia/navigating-the-database-universe/
Discussion fora
• LinkedIn NoSQL
– http://www.linkedin.com/groups?gid=2085042
• LinkedIn NewSQL
– http://www.linkedin.com/groups/NewSQL-4135938
• Google groups
– http://groups.google.com/group/nosql-discussion
• Quora
– https://www.quora.com/NoSQL/
NoSQL jokes/humour ...
• LinkedIn discussion thread
– http://www.linkedin.com/groups/NoSQL-Jokes-Humour-2085042.S.177321213
• NoSQL Better Than MySQL?
– http://www.youtube.com/watch?v=QU34ZVD2ylY
– Shorter version of “Episode 1 -- MongoDB is Web Scale”
• /dev/null vs. MongoDB benchmark bake-off
– http://engineering.wayfair.com/devnull-vs-mongodb-benchmark-bake-off/
NoSQL jokes/humour ...
• say No! No! and No! (=NoSQL Parody)
– http://www.youtube.com/watch?v=fXc-QDJBXpw
• BREAKING: NoSQL just “huge text file and grep”, study finds
– http://thescienceweb.wordpress.com/2014/10/28/breaking-nosql-just-huge-text-file-and-grep-study-finds/
NoSQL jokes/humour ...
• When someone brags about scaling MongoDB to a whopping 100GB
– http://dbareactions.tumblr.com/post/62989609976/when-someone-brags-about-scaling-mongodb-to-a
• Interview with the Ghost of MongoDB Scalability
– http://blog-shaner.rhcloud.com/interview-with-the-ghost-of-mongodb-scalability/
• It’s Time to Breakup with Your Longtime RDBMS
– http://www.marklogic.com/blog/time-breakup-longtime-rdbms/
NoSQL jokes/humour
• Barbie learns about NoSQL
– https://twitter.com/cskama/status/535504624758063108/
• C.R.U.D.
– http://crudcomic.tumblr.com/
• Twitter
– @mongodbfacts
– @BigDataBorat
Miscellaneous ...
• PowerPoint template
– http://www.articulate.com/rapid-elearning/heres-a-free-powerpoint-template-how-i-made-it/
• Autostereogram
– http://www.all-freeware.com/images/full/46590-free_stereogram_screensaver_audio___multimedia_other.jpeg
• Theatre Curtain Animations
– http://www.slideshare.net/chinateacher1/theater-curtain-animations/
Miscellaneous ...
• Icons and images
– http://www.geekpedia.com/icons.php
– http://cemagraphics.deviantart.com/
– http://www.freestockphotos.biz/
– http://www.graphicsfuel.com/2011/09/comments-speech-bubble-icon-psd/
– http://www.softicons.com/free-icons/
– http://icondock.com/
Miscellaneous
• Newspaper headlines
– http://www.imagechef.com/t/n8rm/Newspaper-Headline/
Backup headlines
Source: Inspired by http://thescienceweb.wordpress.com/2014/10/28/breaking-nosql-just-huge-text-file-and-grep-study-finds/
