Big Data and NoSQL DBs
Piyush Gupta, July 2013
Background
Up until the internet age, most data was generated within the enterprise.
Emphasis on schema design; the enterprise was in control. What data do I need to store?
Monetization from increased market share, increased enterprise ops efficiency, new consumer products, Internet of Things.
Our Things are talking to us! (Cars, Appliances, Home, Parking meters-www.sfpark.org)
Next up? Cars bid for parking with the meter!
Data Analytics.
Time to generate results: months, days, hours, minutes, seconds, sub-second. Mathematical packages for analytics (statistical, sentiment, text search).
Results Presentation
User Specific Dynamic Web Content generated in real time driven by analytics. Data visualization
BigData Companies
BigGuys (~1% of revenue is from BigData plays): IBM, Intel, Oracle, HP, Teradata, Fujitsu; Amazon (18%).
About 120 BigData startups, many VC funded.
A platform built for everyone is a platform built for no one!
Good resources:
www.BigDataLandscape.com
www.451Research.com
NoSQL Databases
Common theme: distributed key-value store.
Data is replicated. Some DBs require it, others make it optional; most use cases will replicate.
Throughput, capacity, storage technology, and latency are closely coupled. 4/40/400 TB rule of thumb:
- RAM: lowest latency, highest cost => limited capacity. ns–µs latency / ~4 TB clusters.
- SSD (Solid State Disk; vendors: FusionIO, Violin, OCZ, HP, Intel): lower latency (~50 µs read / ~500 µs write), $2–$4/GB (was $20). 0.x–x ms / 40–100 TB clusters.
- HDD (rotational disk): highest latency, lowest cost. xx–xxx ms / 400 TB+.
Professor Eric Brewer, UC Berkeley, explains the CAP theorem:
- CA: high consistency and availability; accept no response if a node goes down. (Challenge: then how is this A?)
- CP: high consistency and cluster scalability; accept some unavailability during node failure/addition.
- AP: high availability on a cluster of nodes; accept eventual consistency.
=> Understand a DB's design goal. It is unfair to compare a CP system vs. an AP system on latency. Most NoSQL DBs are either AP- or CP-dominant.
Comparing Performance
Hard to create a level playing field when comparing performance across dbs. What matters to your application? Seek best performance for your use case.
Basic Operations
Any distributed database offers these basic operations: READ and WRITE (Update, Delete).
Map-Reduce Concept
Data: a, x, z, d, b, x, e, d, a, b, c
Need: count of x
Map: Count(x) on each node => result
Reduce: Sum(results) => sum

The application server needs the sum; the client server queries the db nodes, which hold the sharded data:
- dB Node 0: a, x, z, d
- dB Node 1: b, x, e, d
- dB Node 2: a, b, c
Map-Reduce Concept
Storage network: store data distributed on servers; bring the data to the client server for computation.
Distributed DB: shard (split/spread) the data across db servers.
MAP phase: the client server requests db server nodes to compute on their share of the data and return the results (db-server-side user-defined function).
REDUCE phase: the client assimilates the results from the nodes and computes the net result (client-server-side user-defined function).
Can only do partial map-reduce if computation on one record subsequently depends on data in another record.
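The two phases can be sketched in a few lines of Python; the node names and function names below are illustrative, not any particular database's API, and the shards are the example data from the previous slide.

```python
# Minimal map-reduce sketch: count occurrences of "x" across sharded data.
nodes = {
    "node0": ["a", "x", "z", "d"],
    "node1": ["b", "x", "e", "d"],
    "node2": ["a", "b", "c"],
}

def map_phase(shard, target):
    # Runs on each db node: count the target value in the local shard only.
    return shard.count(target)

def reduce_phase(partial_results):
    # Runs on the client server: combine the per-node counts.
    return sum(partial_results)

partials = [map_phase(shard, "x") for shard in nodes.values()]
total = reduce_phase(partials)  # cluster-wide count of "x"
```

Each node computes only on its own share of the data, which is why a dependency between records on different nodes would break this pattern.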
Evaluating dB Performance
Understand the WRITE path of a database architecture. READ path trails the WRITE path.
[Diagram: a record (a, x, z, d) is written to Node 0 and copied to Replica 1 and Replica 2.]
Master-Slave architecture vs. Shared-Nothing architecture: partition tolerance is the major difference. In Master-Slave, both node failure and network failure to the master affect performance.
[Diagram: a write flows from the client to Node 0 and on to replicas R1 and R2.]
Each node:
- is aware of its peers (multicast)
- knows the partition table
- has a write buffer in memory
- has a persistent store on SSD
Node 1 (master) write path:
1. Master gets the data.
2. Writes to its memory buffer.
3. Updates the index table in memory.
4. Sends to R1, R2.
5. Waits for R1, R2.
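The write path above can be sketched as follows; this is a hedged toy model under the stated assumptions (in-process replicas, dict-based buffer and index), not any real database's internals.

```python
# Toy model of the master write path: buffer -> index -> replicate -> await acks.
class ReplicaNode:
    def __init__(self):
        self.store = {}

    def replicate(self, key, value):
        self.store[key] = value
        return True  # acknowledge the write

class MasterNode:
    def __init__(self, replicas):
        self.membuf = {}   # in-memory write buffer
        self.index = {}    # in-memory index table
        self.replicas = replicas

    def write(self, key, value):
        self.membuf[key] = value              # 1-2. write to memory buffer
        self.index[key] = len(self.membuf)    # 3. update index table in memory
        acks = [r.replicate(key, value) for r in self.replicas]  # 4. send to R1, R2
        return all(acks)                      # 5. wait for all replica acks

master = MasterNode([ReplicaNode(), ReplicaNode()])
ok = master.write("k1", 42)  # True once both replicas have acknowledged
```

The point of the sketch: the client sees the write succeed only after every replica acknowledges, which is where replication cost enters the WRITE latency.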
MongoDB and RethinkDB do not hash the primary key, which allows range queries on the primary key.
In the hashed-key case, the user can store a copy of the primary key (if it is an integer) in a bin, build a secondary index on it, and do range queries.
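A minimal sketch of that workaround, with plain dicts standing in for the hash-partitioned store and its secondary index (the names `pk_copy` and `range_query` are illustrative):

```python
# Hash-partitioned store: records are located by a hash of the primary key,
# so the keys themselves are in no useful order for range scans.
records = {
    hash(str(k)): {"pk_copy": k, "value": f"v{k}"} for k in [5, 17, 23, 42]
}

# Secondary index on the stored primary-key copy: sorted (pk, location) pairs.
secondary_index = sorted((r["pk_copy"], loc) for loc, r in records.items())

def range_query(lo, hi):
    # Walk the ordered secondary index, not the hashed record locations.
    return [records[loc]["value"] for pk, loc in secondary_index if lo <= pk <= hi]

result = range_query(10, 30)  # matches the records with pk copies 17 and 23
```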
x = 6. Multiple clients are reading and trying to modify x.
Pessimistic Locking: lock before read for a client; release the lock after the client is done writing. What if the client never returns?
Optimistic Locking (also called CAS, Check-and-Set): give the client x and metadata on read; lock only at commit. The client provides the new x and the metadata at write. If the metadata mismatches, the client retries.
=> Pro/Con: x stays open for reads while one client is about to modify it.
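The check-and-set pattern above can be sketched like this; a generation counter plays the role of the metadata handed out on read (an assumption here, though many stores use exactly such a counter).

```python
import threading

class Record:
    def __init__(self, value):
        self.value = value
        self.generation = 0            # metadata returned with every read
        self._lock = threading.Lock()  # held only during commit

    def read(self):
        return self.value, self.generation

    def check_and_set(self, new_value, expected_generation):
        # Lock only at commit, not across the whole read-modify-write cycle.
        with self._lock:
            if self.generation != expected_generation:
                return False  # metadata mismatch: another writer got in first
            self.value = new_value
            self.generation += 1
            return True

x = Record(6)
while True:  # client retry loop: re-read and retry on mismatch
    value, gen = x.read()
    if x.check_and_set(value + 1, gen):
        break
```

Unlike pessimistic locking, a client that never returns holds nothing: it simply fails its commit later, and no lock is left stranded.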
[Chart: average latency (ms) vs. throughput (ops/sec) for Cassandra and MongoDB, over 50,000–200,000 ops/sec; latency axis ticks at 2.5 and 7.5 ms.]
Take Away
Engineers are the smartest people on earth! (OK, I am biased)
Use Cases
Aerospike: ideally suited as a K-V store; ~2–3 ms latency, 500K ops/sec, 1–100 TB SSD store. Efficiently using SSDs is their secret sauce!
Potential Use Cases:
- Real-time ad bidding market
- Real-time sports analytics (check out www.SportsVision.com)
- Real-time interaction with Things
MongoDB, Couchbase 2.0: ideally suited as a document store; 16 MB (MongoDB) / 20 MB (Couchbase) per document, low throughput requirements, relaxed latency requirements. Replication and scalability are not without their challenges. (Web server developers like the APIs, server-side JavaScript, and the JSON data pipeline.)
Cassandra: ideally suited for relaxed consistency requirements and large capacity (400 TB); sparsely populated data matrix, columnar database, K-V store. (Netflix and eBay are major adopters.)
HBase: ideally suited as a K-V store in the ETL pipeline on a Hadoop cluster; best capacity, but higher latency and lower throughput (~35 ms at 20K ops/sec). Facebook switched from Cassandra to HBase in 2010.
Read the details and choose the correct DB for your application!
Nomenclature
Equivalents....
RDBMS                  Aerospike    MongoDB, CouchBaseDB*    Cassandra, HBase
database / tablespace  namespace    db                       keyspace
table                  set          collection               column-family
primary key            key          _id                      row-id
row                    record       document                 row
column                 bin          n/a (json)               column-name
www.aerospike.com