Big Data and NoSQL DBs
Piyush Gupta, July 2013
Background
Up until the internet age, most data was generated within the enterprise.
Emphasis on schema design; the enterprise was in control. What data do I need to store?
Monetization from increased market share, increased enterprise ops efficiency, new consumer products, Internet of Things.
Our Things are talking to us! (Cars, Appliances, Home, Parking meters-www.sfpark.org)
Next up? Cars bid for parking with the meter!
Data Analytics.
Time to generate results: months, days, hours, minutes, seconds, sub-second. Mathematical packages for analytics (statistical, sentiment, text search).
Results Presentation
User Specific Dynamic Web Content generated in real time driven by analytics. Data visualization
BigData Companies
BigGuys (~1% of revenue is from BigData plays): IBM, Intel, Oracle, HP, Teradata, Fujitsu; Amazon (18%).
About 120 BigData startups, many VC funded.
A platform built for everyone is a platform built for no one!
Good resources:
www.BigDataLandscape.com
www.451Research.com
NoSQL Databases
Common theme: distributed key-value store.
Data is replicated. Some DBs require it, others make it optional; most use cases will replicate.
Throughput, capacity, storage technology, and latency are closely coupled. 4/40/400 TB rule of thumb:
- RAM: lowest latency, highest cost => limited capacity. ns–µs latency / ~4 TB clusters.
- SSD (Solid State Disk; vendors: FusionIO, Violin, OCZ, HP, Intel): lower latency (~50 µs read / ~500 µs write), $2–$4/GB (was $20). 0.x–x ms / 40–100 TB clusters.
- HDD (rotational disk): highest latency, lowest cost. xx–xxx ms / 400 TB+.
Professor Eric Brewer, UC Berkeley, explains the CAP theorem:
- CA: high consistency and availability; accept no response if a node goes down. (Challenge: then how is this A?)
- CP: high consistency and cluster scalability; accept some unavailability during node failure/addition.
- AP: high availability on a cluster of nodes; accept eventual consistency.
=> Understand a DB's design goal. It is unfair to compare a CP system vs. an AP system on latency. Most NoSQL DBs are either AP- or CP-dominant.
Comparing Performance
Hard to create a level playing field when comparing performance across dbs. What matters to your application? Seek best performance for your use case.
Basic Operations
Any distributed database offers these basic operations: READ and WRITE (Update, Delete).
Map-Reduce Concept
Data: a, x, z, d, b, x, e, d, a, b, c
Need: count of x
Map: Count(x) on each node => result
Reduce: Sum(results) => sum

The application server needs the sum; the client server queries the db nodes, which hold the sharded data:
- dB Node 0: a, x, z, d
- dB Node 1: b, x, e, d
- dB Node 2: a, b, c
Map-Reduce Concept
Storage network: store data distributed on servers; bring the data to the client server for computation.
Distributed DB: shard (split/spread) the data across db servers.
MAP phase: the client server requests db server nodes to compute on their share of the data and return the results (db-server-side user-defined function).
REDUCE phase: the client assimilates the results from the nodes and computes the net result (client-server-side user-defined function).
Can only do partial map-reduce if computation on one record subsequently depends on data in another record.
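The two phases can be sketched in a few lines of Python; the node names and function names below are illustrative, not any particular database's API, and the shards are the example data from the previous slide.

```python
# Minimal map-reduce sketch: count occurrences of "x" across sharded data.
nodes = {
    "node0": ["a", "x", "z", "d"],
    "node1": ["b", "x", "e", "d"],
    "node2": ["a", "b", "c"],
}

def map_phase(shard, target):
    # Runs on each db node: count the target value in the local shard only.
    return shard.count(target)

def reduce_phase(partial_results):
    # Runs on the client server: combine the per-node counts.
    return sum(partial_results)

partials = [map_phase(shard, "x") for shard in nodes.values()]
total = reduce_phase(partials)  # cluster-wide count of "x"
```

Each node computes only on its own share of the data, which is why a dependency between records on different nodes would break this pattern.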
Evaluating dB Performance
Understand the WRITE path of a database architecture. READ path trails the WRITE path.
[Diagram: a record (a, x, z, d) is written to Node 0 and copied to Replica 1 and Replica 2.]
Master-Slave architecture vs. Shared-Nothing architecture: partition tolerance is the major difference. In Master-Slave, both node failure and network failure to the master affect performance.
[Diagram: a write flows from the client to Node 0 and on to replicas R1 and R2.]
Each node:
- is aware of its peers (multicast)
- knows the partition table
- has a write buffer in memory
- has a persistent store on SSD
Node 1 (master) write path:
1. Master gets the data.
2. Writes to its memory buffer.
3. Updates the index table in memory.
4. Sends to R1, R2.
5. Waits for R1, R2.
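The write path above can be sketched as follows; this is a hedged toy model under the stated assumptions (in-process replicas, dict-based buffer and index), not any real database's internals.

```python
# Toy model of the master write path: buffer -> index -> replicate -> await acks.
class ReplicaNode:
    def __init__(self):
        self.store = {}

    def replicate(self, key, value):
        self.store[key] = value
        return True  # acknowledge the write

class MasterNode:
    def __init__(self, replicas):
        self.membuf = {}   # in-memory write buffer
        self.index = {}    # in-memory index table
        self.replicas = replicas

    def write(self, key, value):
        self.membuf[key] = value              # 1-2. write to memory buffer
        self.index[key] = len(self.membuf)    # 3. update index table in memory
        acks = [r.replicate(key, value) for r in self.replicas]  # 4. send to R1, R2
        return all(acks)                      # 5. wait for all replica acks

master = MasterNode([ReplicaNode(), ReplicaNode()])
ok = master.write("k1", 42)  # True once both replicas have acknowledged
```

The point of the sketch: the client sees the write succeed only after every replica acknowledges, which is where replication cost enters the WRITE latency.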
MongoDB and RethinkDB do not hash the primary key, which allows range queries on the primary key.
In the hashed-key case, the user can store a copy of the primary key (if it is an integer) in a bin, build a secondary index on it, and do range queries.
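A minimal sketch of that workaround, with plain dicts standing in for the hash-partitioned store and its secondary index (the names `pk_copy` and `range_query` are illustrative):

```python
# Hash-partitioned store: records are located by a hash of the primary key,
# so the keys themselves are in no useful order for range scans.
records = {
    hash(str(k)): {"pk_copy": k, "value": f"v{k}"} for k in [5, 17, 23, 42]
}

# Secondary index on the stored primary-key copy: sorted (pk, location) pairs.
secondary_index = sorted((r["pk_copy"], loc) for loc, r in records.items())

def range_query(lo, hi):
    # Walk the ordered secondary index, not the hashed record locations.
    return [records[loc]["value"] for pk, loc in secondary_index if lo <= pk <= hi]

result = range_query(10, 30)  # matches the records with pk copies 17 and 23
```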
x = 6. Multiple clients are reading and trying to modify x.
Pessimistic Locking: lock before read for a client; release the lock after the client is done writing. What if the client never returns?
Optimistic Locking (also called CAS, Check-and-Set): give the client x and metadata on read; lock only at commit. The client provides the new x and the metadata at write. If the metadata mismatches, the client retries.
=> Pro/Con: x stays open for reads while one client is about to modify it.
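The check-and-set pattern above can be sketched like this; a generation counter plays the role of the metadata handed out on read (an assumption here, though many stores use exactly such a counter).

```python
import threading

class Record:
    def __init__(self, value):
        self.value = value
        self.generation = 0            # metadata returned with every read
        self._lock = threading.Lock()  # held only during commit

    def read(self):
        return self.value, self.generation

    def check_and_set(self, new_value, expected_generation):
        # Lock only at commit, not across the whole read-modify-write cycle.
        with self._lock:
            if self.generation != expected_generation:
                return False  # metadata mismatch: another writer got in first
            self.value = new_value
            self.generation += 1
            return True

x = Record(6)
while True:  # client retry loop: re-read and retry on mismatch
    value, gen = x.read()
    if x.check_and_set(value + 1, gen):
        break
```

Unlike pessimistic locking, a client that never returns holds nothing: it simply fails its commit later, and no lock is left stranded.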
[Chart: average latency (ms) vs. throughput (ops/sec) for Cassandra and MongoDB, over 50,000–200,000 ops/sec; latency axis ticks at 2.5 and 7.5 ms.]
Take Away
Engineers are the smartest people on earth! (OK, I am biased)
Use Cases
Aerospike: ideally suited as a K-V store; ~2–3 ms latency, 500K ops/sec, 1–100 TB SSD store. Efficiently using SSDs is their secret sauce!
Potential Use Cases:
- Real-time ad bidding market
- Real-time sports analytics (check out www.SportsVision.com)
- Real-time interaction with Things
MongoDB, Couchbase 2.0: ideally suited as a document store; 16 MB (MongoDB) / 20 MB (Couchbase) per document, low throughput requirements, relaxed latency requirements. Replication and scalability are not without their challenges. (Web server developers like the APIs, server-side JavaScript, and the JSON data pipeline.)
Cassandra: ideally suited for relaxed consistency requirements and large capacity (400 TB); sparsely populated data matrix, columnar database, K-V store. (Netflix and eBay are major adopters.)
HBase: ideally suited as a K-V store in the ETL pipeline on a Hadoop cluster; best capacity, but higher latency and lower throughput (~35 ms at 20K ops/sec). Facebook switched from Cassandra to HBase in 2010.
Read the details and choose the correct DB for your application!
Nomenclature
Equivalents....
RDBMS                  Aerospike    MongoDB, CouchBaseDB*    Cassandra, HBase
database / tablespace  namespace    db                       keyspace
table                  set          collection               column-family
primary key            key          _id                      row-id
row                    record       document                 row
column                 bin          n/a (json)               column-name
www.aerospike.com