Introduction To NOSQL and Cassandra: @rantav @outbrain
SQL is good
• Rich language
• Easy to use and integrate
• Rich toolset
• Many vendors
SCALING
Scaling Solutions - Replication
Scales Reads
Scaling Solutions - Sharding
• Or - an array of SQLs
Consistency + Partition Tolerance (no Availability)
Availability + Partition Tolerance (no Consistency)
Consistency Levels
• Developed at Facebook
• Open-sourced through Apache
• Implemented in Java
CONSISTENCY DOWN TO EARTH
N/R/W
• QUORUM:
o R = N/2+1
o W = N/2+1
o => Fully consistent
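The QUORUM rule above can be sketched in a few lines. The point is that when R + W > N, every read set overlaps every write set in at least one replica, so a read always reaches a node holding the latest write. The class and method names below are hypothetical and only illustrate the arithmetic; they are not part of Cassandra.

```java
/** Hypothetical helper illustrating the N/R/W rule; not a Cassandra class. */
final class QuorumMath {
    private QuorumMath() {}

    /** Quorum size for replication factor n: floor(n / 2) + 1. */
    static int quorum(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        return n / 2 + 1; // integer division rounds down
    }

    /** True when any R replicas and any W replicas must share a node, i.e. R + W > N. */
    static boolean readsOverlapWrites(int n, int r, int w) {
        return r + w > n;
    }
}
```

With N = 3, QuorumMath.quorum(3) gives 2, and reading and writing at 2 replicas each satisfies the overlap rule, while R = 1 and W = 1 does not.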
Data Model - Forget SQL
struct Column {
1: binary name,
2: binary value,
3: i64 timestamp,
}
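A minimal Java sketch of the same structure, to show what the timestamp is for: when two versions of a column exist, the one with the higher timestamp wins. ColumnSketch is a hypothetical name, not the Thrift-generated class, and keeping the first argument on a timestamp tie is a simplification made here.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical mirror of the Thrift Column struct: name, value, timestamp. */
final class ColumnSketch {
    final byte[] name;
    final byte[] value;
    final long timestamp;

    ColumnSketch(String name, String value, long timestamp) {
        this.name = name.getBytes(StandardCharsets.UTF_8);
        this.value = value.getBytes(StandardCharsets.UTF_8);
        this.timestamp = timestamp;
    }

    String valueAsString() {
        return new String(value, StandardCharsets.UTF_8);
    }

    /** Returns the version with the higher timestamp; on a tie this sketch keeps a. */
    static ColumnSketch newest(ColumnSketch a, ColumnSketch b) {
        return b.timestamp > a.timestamp ? b : a;
    }
}
```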
JSON-ish notation:
{
"name": "emailAddress",
"value": "[email protected]",
"timestamp": 123456789
}
Data Model - Column Family
Users: CF
ran: ROW
emailAddress: [email protected] COLUMN
webSite: http://bar.com COLUMN
f.rat: ROW
emailAddress: [email protected] COLUMN
Stats: CF
ran: ROW
visits: 243 COLUMN
Data Model - Songs example
Songs:
Meir Ariel:
Shir Keev: 6:13,
Tikva: 4:11,
Erol: 6:17
Suetz: 5:30
Dr Hitchakmut: 3:30
Mashina:
Rakevet Layla: 3:02
Optikai: 5:40
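One way to picture a column family is as a map from row key to a sorted map of column name to value. The sketch below models the Songs example that way. ColumnFamilySketch and its methods are hypothetical names chosen for illustration, and the half-open slice range (start inclusive, end exclusive) is an assumption of this sketch, not a statement about Cassandra's get_slice.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical in-memory model: row key -> (column name -> value), columns kept sorted. */
final class ColumnFamilySketch {
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    void insert(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    /** Returns the value, or null if the row or the column does not exist. */
    String get(String rowKey, String column) {
        SortedMap<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(column);
    }

    /** Columns with start <= name < end, in name order (a copy, safe to modify). */
    SortedMap<String, String> getSlice(String rowKey, String start, String end) {
        SortedMap<String, String> row = rows.get(rowKey);
        return row == null ? new TreeMap<>() : new TreeMap<>(row.subMap(start, end));
    }
}
```

Because the columns of a row are sorted by name, a slice is a cheap range scan over one row, which is why wide rows such as "all songs by Meir Ariel" work well in this model.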
Data Model - Super Columns
Songs:
Meir Ariel:
Shirey Hag:
Shir Keev: 6:13,
Tikva: 4:11,
Erol: 6:17
Vegluy Eynaim:
Suetz: 5:30
Dr Hitchakmut: 3:30
Mashina:
...
get
get_slice
multiget
multiget_slice
get_count
get_range_slice
get_range_slices
insert
remove
batch_insert
batch_mutate
The True API
• N - per keyspace
• R - per read request
• W - per write request
Consistency Model
Cassandra defines:
enum ConsistencyLevel {
ZERO = 0,
ONE = 1,
QUORUM = 2,
DCQUORUM = 3,
ALL = 5,
}
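A rough sketch of how a consistency level translates into a number of replicas that must respond, given replication factor N. The names are hypothetical. DCQUORUM is left out because it depends on how replicas are spread across data centers, and ZERO is treated here as "wait for no replica", which applies to writes.

```java
/** Hypothetical mapping from consistency level to replicas that must answer. */
final class ConsistencySketch {
    enum Level { ZERO, ONE, QUORUM, ALL }

    private ConsistencySketch() {}

    /** Number of replicas (out of n) to wait for before the operation returns. */
    static int requiredReplicas(Level level, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        switch (level) {
            case ZERO:   return 0;          // fire and forget
            case ONE:    return 1;
            case QUORUM: return n / 2 + 1;  // majority of replicas
            case ALL:    return n;
            default:     throw new IllegalArgumentException("unknown level: " + level);
        }
    }
}
```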
Java Code
• Encapsulates thrift
• Adds JMX (Monitoring)
• Connection pooling
• Failover
• Open-sourced on GitHub, with a growing community of developers and users.
Java Client - Hector - cont
/**
 * Insert a new value keyed by key
 *
 * @param key Key for the value
 * @param value the String value to insert
 */
public void insert(final String key, final String value) {
    Mutator m = createMutator(keyspaceOperator);
    m.insert(key,
             CF_NAME,
             createColumn(COLUMN_NAME, value));
}
Java Client - Hector - cont
/**
 * Get a string value.
 *
 * @return The string value; null if no value exists for the given key.
 */
public String get(final String key) throws HectorException {
    ColumnQuery<String, String> q = createColumnQuery(keyspaceOperator, serializer, serializer);
    Result<HColumn<String, String>> r = q.setKey(key).
        setName(COLUMN_NAME).
        setColumnFamily(CF_NAME).
        execute();
    HColumn<String, String> c = r.get();
    return c == null ? null : c.getValue();
}
Extra
Cross-language protocol
Compiles to: C++, Java, PHP, Ruby, Erlang, Perl, ...
struct UserProfile {
1: i32 uid,
2: string name,
3: string blurb
}
service UserStorage {
void store(1: UserProfile user),
UserProfile retrieve(1: i32 uid)
}
Thrift
Generating sources:
BigTable http://labs.google.com/papers/bigtable.html
Dynamo http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
From Dynamo:
• Symmetric p2p architecture
• Gossip based discovery and error detection
• Distributed key-value store
o Pluggable partitioning
o Pluggable topology discovery
• Eventually consistent, tunable per operation
From BigTable
• p2p
• Enables seamless node addition.
• Rebalancing of keys
• Fast detection of nodes that go down.
• Every node knows about all others - no master.
Internals - Consistent Hashing
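A minimal sketch of a consistent-hashing ring, to make the idea concrete: nodes and keys are hashed onto the same ring, and a key belongs to the first node at or after its position, wrapping around at the end. HashRingSketch is a hypothetical name. Deriving a node's position by hashing its name with MD5 is an assumption made to keep the example short; in a real cluster each node's token is assigned through configuration or bootstrap.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical consistent-hashing ring: token -> node, ordered by token. */
final class HashRingSketch {
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    /** Maps a string to a non-negative position on the ring using MD5. */
    static BigInteger token(String s) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(s.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    void addNode(String node) {
        ring.put(token(node), node);
    }

    void removeNode(String node) {
        ring.remove(token(node));
    }

    /** First node clockwise from the key's token, wrapping to the start of the ring. */
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<BigInteger, String> owner = ring.ceilingEntry(token(key));
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }
}
```

The useful property is that adding a node only takes over keys from its clockwise neighbor; every other key stays where it was, which is what makes adding nodes cheap.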
Memtables
Write Path
Compactions
Write Properties
• No reads
• No seeks
• Fast
• Atomic within ColumnFamily
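The write properties above follow from the shape of the write path, sketched here under simplifying assumptions: a write is appended to a log, applied to a sorted in-memory table (the memtable), and the memtable is flushed as an immutable sorted table once it grows past a threshold. Nothing is read and nothing is updated in place. All names are hypothetical, the in-memory list stands in for an append-only commit log on disk, and flushing by entry count is a simplification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical model of a write path: log append, memtable update, flush. */
final class WritePathSketch {
    private final List<String> commitLog = new ArrayList<>(); // stand-in for an on-disk log
    private final List<SortedMap<String, String>> sstables = new ArrayList<>();
    private TreeMap<String, String> memtable = new TreeMap<>();
    private final int flushThreshold;

    WritePathSketch(int flushThreshold) {
        if (flushThreshold < 1) {
            throw new IllegalArgumentException("flushThreshold must be >= 1");
        }
        this.flushThreshold = flushThreshold;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // 1. sequential append, no seek
        memtable.put(key, value);         // 2. in-memory update, no read of old data
        if (memtable.size() >= flushThreshold) {
            flush();                      // 3. occasional sequential dump
        }
    }

    private void flush() {
        sstables.add(Collections.unmodifiableSortedMap(memtable)); // immutable once written
        memtable = new TreeMap<>();
        commitLog.clear(); // flushed data no longer needs the log
    }

    int sstableCount() {
        return sstables.size();
    }

    int memtableSize() {
        return memtable.size();
    }
}
```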
Read Path
Reads
Read Properties
• Merge keys
• Combine columns
• Discard tombstones
• Use bloom filters bitwise OR operation
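The merge, combine and discard steps listed above can be sketched as follows: the same row may exist in the memtable and in several on-disk tables, so a read collects all versions of each column, keeps the newest by timestamp, and drops columns whose newest version is a tombstone. The names are hypothetical, and representing a tombstone as a null value is an assumption of this sketch. Bloom filters, which let a read skip tables that cannot contain the key, are not modeled.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical read-side merge of one row across several sources. */
final class ReadMergeSketch {
    /** One version of a column; a null value marks a tombstone (deletion marker). */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    private ReadMergeSketch() {}

    /** Merges column versions from all sources; newest timestamp wins, tombstones are dropped. */
    static Map<String, String> merge(List<Map<String, Cell>> sources) {
        Map<String, Cell> newest = new TreeMap<>();
        for (Map<String, Cell> source : sources) {
            for (Map.Entry<String, Cell> e : source.entrySet()) {
                Cell seen = newest.get(e.getKey());
                if (seen == null || e.getValue().timestamp > seen.timestamp) {
                    newest.put(e.getKey(), e.getValue());
                }
            }
        }
        Map<String, String> row = new TreeMap<>();
        for (Map.Entry<String, Cell> e : newest.entrySet()) {
            if (!e.getValue().isTombstone()) {
                row.put(e.getKey(), e.getValue().value);
            }
        }
        return row;
    }
}
```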
SEDA
anti entropy
hinted handoff
repair on read
timestamps -> vector clocks
consistent hashing
merkle trees
References
• http://horicky.blogspot.com/2009/11/nosql-patterns.html
• http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf
• http://labs.google.com/papers/bigtable.html
• https://nosqleast.com/2009/
• http://bret.appspot.com/entry/how-friendfeed-uses-mysql
• http://www.julianbrowne.com/article/viewer/brewers-cap-theorem
• http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
• http://wiki.apache.org/cassandra/DataModel
• http://incubator.apache.org/thrift/