NoSQL: Basics
Alexey Zinovyev, Java/BigData Trainer in EPAM
About
With IT since 2007
With Java since 2009
With Hadoop since 2012
With EPAM since 2015
Contacts
E-mail : [email protected]
Twitter : @zaleslaw @BigDataRussia
Facebook: https://www.facebook.com/zaleslaw
vk.com/big_data_russia Big Data Russia
vk.com/java_jvm Java & JVM langs
Training from Zinoviev Alexey 3
The Good Old Days
One of these fine days...
We have a NoSQL job for you, son!
But you like SQL and HATE nontraditional data
Let's talk about it, boy...
The Database Market
Let's use Cassandra to keep logs ONLY...
For logs
ONLY?
Case #0 : PostgreSQL & Cassandra
Backend Development in 1997
Modern Java in 2016
Backend Development in 2017
10^6 rows in MySQL
GB->TB->PB->?
This chapter covers
Highload applications
Sharding & replication
NoSQL databases
Case #1 : Likes in Classmates
HIGHLOAD APPLICATIONS
Case #2 : Pet E-Commerce
Simple web application #1
Simple web application #2
Simple web application #3
Simple web application #4
Simple web application #5
Simple web application #6
Simple web application #7
Simple web application #8
Big Table
Sharding
Classic Sharding
Sharding with Locator
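The two routing schemes named on the sharding slides can be sketched in plain Java. This is a minimal illustration under stated assumptions, not a production router: the names `ShardingSketch`, `classicShard` and `LocatorRouter`, as well as the shard count of 4, are hypothetical. Classic sharding computes the shard directly from the key; sharding with a locator looks the shard up in a separate directory, so a single key can be moved without rehashing everything.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names throughout; a sketch of the two routing ideas only.
class ShardingSketch {

    // Classic sharding: the shard is a pure function of the key.
    static int classicShard(String key, int shardCount) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), shardCount);
    }

    // Sharding with a locator: a directory remembers where each key lives.
    static class LocatorRouter {
        private final Map<String, Integer> directory = new HashMap<>();
        private final int shardCount;

        LocatorRouter(int shardCount) {
            this.shardCount = shardCount;
        }

        // Unknown keys get a default placement, which is then remembered.
        int shardFor(String key) {
            return directory.computeIfAbsent(key, k -> classicShard(k, shardCount));
        }

        // Moving one key only updates its directory entry.
        void move(String key, int newShard) {
            directory.put(key, newShard);
        }
    }

    public static void main(String[] args) {
        System.out.println("classic: user42 -> shard " + classicShard("user42", 4));
        LocatorRouter locator = new LocatorRouter(4);
        locator.move("user42", 3);
        System.out.println("locator: user42 -> shard " + locator.shardFor("user42"));
    }
}
```

The trade-off in this sketch is that the locator needs its own storage and one extra lookup per request, while the classic scheme has to move most keys when the shard count changes.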
Case #3 : Supercomputer for President
Motivation: Scale-up vs scale-out
Scale-up: 16 CPUs -> one server with 48 CPUs
Scale-out: 16 CPUs -> three servers with 16 CPUs each
Motivation: scalability
[Figure: data volumes of 50GB, 300GB and 5TB]
Motivation: Fault tolerance
Case #4 : DDoS attack against retail company
Replication
You should measure performance!
NOSQL
Replication
CAP Theorem
Databases
Cassandra
HBase
Neo4j
Riak
Do you know the data models of these databases?
Databases with types
Cassandra (column family)
HBase (column family)
Neo4j (graph)
Riak (key-value)
Database party
What's the problem with RDBMSs?
Caching
Master/Slave
Cluster
Table Partitioning
Sharding
Evolution
Flowers in your garden
Data Model           Performance  Scalability      Flexibility  Complexity  Functionality
Key-value stores     high         high             high         none        variable (none)
Column store         high         high             moderate     low         minimal
Document store       high         variable (high)  high         low         variable (low)
Graph database       variable     variable         high         high        graph theory
Relational database  variable     variable         low          moderate    relational algebra
Gentle NoSQL
Scalability
Nodes and Data Centers
Easy to add a new server
Specific data model
Eventual consistency
ACID in SQL
Atomicity in NoSQL
read-modify-write (CAS)
key/row manipulation is atomic
API for atomic operations
poor support for transactions
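A compare-and-set (CAS) loop is the usual way to build an atomic read-modify-write on top of per-key atomicity. The sketch below uses a `ConcurrentHashMap` as a stand-in for a key/value store; the class name `CasSketch` and its methods are hypothetical and do not belong to any database API mentioned in this deck.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for a key/value store with per-key CAS.
class CasSketch {
    private final ConcurrentMap<String, Long> store = new ConcurrentHashMap<>();

    // Read-modify-write: retry until no other writer changed the value in between.
    long increment(String key) {
        while (true) {
            Long current = store.get(key);
            if (current == null) {
                // putIfAbsent is atomic: only one writer creates the key.
                if (store.putIfAbsent(key, 1L) == null) {
                    return 1L;
                }
            } else {
                long next = current + 1;
                // replace(key, expected, new) succeeds only if the value is still `current`.
                if (store.replace(key, current, next)) {
                    return next;
                }
            }
        }
    }

    Long get(String key) {
        return store.get(key);
    }

    public static void main(String[] args) throws InterruptedException {
        CasSketch counters = new CasSketch();
        Runnable work = () -> {
            for (int i = 0; i < 1000; i++) {
                counters.increment("likes");
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        a.join();
        b.join();
        System.out.println(counters.get("likes")); // prints 2000
    }
}
```

No update is lost because every successful step is a single atomic operation on one key, which is the only guarantee the loop relies on.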
BASE
Basically available: every query completes
Soft state: state can change even without writes
Eventual consistency
Flowers in your garden
Database   Data model      Query API            Data storage system
Cassandra  Column family   Thrift               Memtable/SSTable
CouchDB    Documents       Map/Reduce           Append-only B-tree
HBase      Column family   Thrift, REST         Memtable/SSTable on HDFS
MongoDB    Documents       Cursor               B-tree
Neo4j      Edges/Vertices  Graph                On-disk linked lists
Riak       Key/Value       Nested hashes, REST  Hash
What about query languages in these databases?
Mongo
16 MB document size limit
JavaScript under the hood
2d, 2dsphere, B-tree indexes
3.0 version is very hot
integration with Kafka
Cassandra
CQL
2 billion columns per row
No ACID, of course
Can consume all your RAM
JVM-based
Riak
Links to other keys
REST
Consistency level in each query
Ring of nodes
Neo4j
Vertices, edges
ACID
REST API + Cypher
2d index
Not so good for distributed data
And what should we choose?
Network Rule
Can your data be represented as a network or graph?
If yes -> Neo4j
If no -> continue
BigData Rule
Do you have TB or PB of data?
If yes -> NoSQL
If no -> Maybe you should stay with SQL?
Easy Rule
Do you need operations more complex than read/write by key?
If yes -> continue
If no -> Riak, Memcached
Hierarchy Rule
Do you have nested data? Do reads far outnumber writes (R >> W) in your system?
If yes -> MongoDB, CouchDB
If no -> continue
Hadoop Rule
Is Hadoop integration required? Can your data be represented as a flat table?
If yes -> HBase
If no -> continue
Availability Rule
Do you need high availability? Can you accept eventual consistency?
If yes -> Cassandra
If no -> continue
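The six rules above form a simple decision chain, which can be written down as code. This is only a sketch of the slides' logic: the class `DatabaseChooser`, the `Needs` fields and the fallback value "undecided" are hypothetical, and reading the BigData Rule as "if yes, continue with the NoSQL rules" is an assumption, since the slides do not say what follows the last rule.

```java
// Hypothetical names; the rule order follows the slides above.
class DatabaseChooser {

    // One flag per rule; all default to false.
    static class Needs {
        boolean graphShaped;                 // Network Rule
        boolean terabytesOrMore;             // BigData Rule
        boolean moreThanKeyAccess;           // Easy Rule
        boolean nestedAndReadHeavy;          // Hierarchy Rule
        boolean hadoopAndFlatTable;          // Hadoop Rule
        boolean availabilityOverConsistency; // Availability Rule
    }

    static String choose(Needs n) {
        if (n.graphShaped) return "Neo4j";
        if (!n.terabytesOrMore) return "SQL";
        if (!n.moreThanKeyAccess) return "Riak, Memcached";
        if (n.nestedAndReadHeavy) return "MongoDB, CouchDB";
        if (n.hadoopAndFlatTable) return "HBase";
        if (n.availabilityOverConsistency) return "Cassandra";
        return "undecided"; // assumption: the slides give no answer past the last rule
    }

    public static void main(String[] args) {
        Needs logs = new Needs();
        logs.terabytesOrMore = true;
        logs.moreThanKeyAccess = true;
        logs.availabilityOverConsistency = true;
        System.out.println(choose(logs)); // prints Cassandra
    }
}
```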
Case #5 : Go back to PostgreSQL
CASSANDRA
Cassandra Features
Linear scalability
Master-less architecture
Tunable consistency
Multi data center support
Massive write and fast response
Flexible data model
Cassandra Ring Architecture
Cassandra Replication
SimpleStrategy
NetworkTopologyStrategy
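The ring and the simpler of the two strategies can be illustrated with a sorted map of tokens. The sketch below is a deliberate simplification: the class `RingSketch` and the integer tokens are hypothetical, while real Cassandra derives tokens with a partitioner and usually assigns many virtual nodes per server. It shows only the core idea behind SimpleStrategy, which is to walk clockwise from the key's token and take the next distinct nodes; NetworkTopologyStrategy additionally takes data centers and racks into account and is not modeled here.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.TreeMap;

// Simplified token ring; hypothetical names, integer tokens chosen by hand.
class RingSketch {
    private final TreeMap<Integer, String> ring = new TreeMap<>(); // token -> node

    void addNode(String node, int token) {
        ring.put(token, node);
    }

    // Walk clockwise from the key's token and collect `rf` distinct nodes.
    List<String> replicasFor(int keyToken, int rf) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        // Never ask for more replicas than there are distinct nodes.
        int wanted = Math.min(rf, new HashSet<>(ring.values()).size());
        Integer token = ring.ceilingKey(keyToken);
        if (token == null) {
            token = ring.firstKey(); // wrap around the end of the ring
        }
        while (replicas.size() < wanted) {
            String node = ring.get(token);
            if (!replicas.contains(node)) {
                replicas.add(node);
            }
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey();
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        RingSketch ring = new RingSketch();
        ring.addNode("A", 0);
        ring.addNode("B", 100);
        ring.addNode("C", 200);
        ring.addNode("D", 300);
        System.out.println(ring.replicasFor(150, 3)); // prints [C, D, A]
    }
}
```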
Cassandra Tunable Consistency
Write Consistency   Read Consistency
ALL                 ALL
EACH_QUORUM         EACH_QUORUM
QUORUM              QUORUM
LOCAL_QUORUM        LOCAL_QUORUM
ONE                 ONE
TWO                 TWO
LOCAL_ONE           LOCAL_ONE
ANY
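The arithmetic behind these levels is short enough to write out. A quorum is a majority of the replicas, and a read is guaranteed to touch at least one replica holding the latest acknowledged write when the number of replicas written plus the number read exceeds the replication factor. The helper names below (`QuorumSketch`, `quorum`, `readsSeeLatestWrite`) are hypothetical and not part of any driver.

```java
// Hypothetical helper; plain arithmetic, no Cassandra classes involved.
class QuorumSketch {

    // A quorum is a majority of the replicas: floor(rf / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // The read and write replica sets must overlap: W + R > RF.
    static boolean readsSeeLatestWrite(int writeReplicas, int readReplicas, int replicationFactor) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println(q);                                // prints 2
        System.out.println(readsSeeLatestWrite(q, q, rf));    // QUORUM + QUORUM: prints true
        System.out.println(readsSeeLatestWrite(1, 1, rf));    // ONE + ONE: prints false
        System.out.println(readsSeeLatestWrite(rf, 1, rf));   // ALL + ONE: prints true
    }
}
```

With a replication factor of 3, writing and reading at QUORUM therefore tolerates one unavailable replica while still returning the latest write, whereas ONE + ONE trades that guarantee for latency.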
DATA MODEL
Data Model: Map<RowKey, SortedMap<ColumnKey, Value>>
Wide rows (2 billion cols)
Column names are data
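The nested-map view of a wide row can be made concrete with JDK collections. In this sketch the outer map is keyed by the row (partition) key and the inner sorted map by column name; because the column names are themselves data (here, timestamps), a range of columns can be sliced out of one row. The class `WideRowSketch` and the sample keys are hypothetical, and the sketch says nothing about how Cassandra stores rows on disk.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model of a wide row: row key -> (sorted column name -> value).
class WideRowSketch {
    private final Map<String, SortedMap<String, String>> rows = new HashMap<>();

    void put(String rowKey, String column, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
    }

    // Range scan over column names inside one row: from (inclusive) to (exclusive).
    SortedMap<String, String> slice(String rowKey, String from, String to) {
        SortedMap<String, String> row = rows.get(rowKey);
        if (row == null) {
            return Collections.emptySortedMap();
        }
        return row.subMap(from, to);
    }

    public static void main(String[] args) {
        WideRowSketch events = new WideRowSketch();
        // Column names are data: each column name is the time of the event.
        events.put("user1", "2016-01-01T10:00", "login");
        events.put("user1", "2016-01-02T09:30", "purchase");
        events.put("user1", "2016-01-03T18:45", "logout");
        System.out.println(events.slice("user1", "2016-01-02", "2016-01-03"));
        // prints {2016-01-02T09:30=purchase}
    }
}
```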
KeySpace
CREATE KEYSPACE Excelsior WITH REPLICATION =
  { 'class' : 'SimpleStrategy', 'replication_factor' : 3 };
CREATE KEYSPACE "Excalibur" WITH REPLICATION =
  { 'class' : 'NetworkTopologyStrategy', 'dc1' : 3, 'dc2' : 2 };
Tables
CREATE TABLE users (
  id int PRIMARY KEY, first_name text,
  last_name text, age int, hobbies list<text> );
-- Equivalent form with a separate PRIMARY KEY clause:
CREATE TABLE users (
  id int, first_name text, last_name text,
  age int, hobbies list<text>, PRIMARY KEY (id) );
Cassandra CQL (KEYS)
No                         Use
NO joins                   Data denormalization
NO subqueries              Data duplication
NO Group By functionality  PK, Indexes
NO aggregation functions   Counters, CAS Transactions, Batches
                           Advanced Data Structures
Key types
Simple primary key:     PRIMARY KEY (empID)
                        partition key: empID
Compound primary key:   PRIMARY KEY (empID, subdeptID, depID)
                        partition key: empID; clustering keys: subdeptID, depID
Composite primary key:  PRIMARY KEY ((block_id, breed), color)
                        partition key: (block_id, breed); clustering key: color
User-defined types
CREATE TYPE address (
street text,
city text,
zip_code int,
phones set<text> );
Secondary indexes
CREATE INDEX state_key ON users (state);
SELECT * FROM users WHERE gender = 'f' AND state = 'TX'
ALLOW FILTERING;
Tables
CREATE TABLE users (
user_id text PRIMARY KEY,
first_name text, last_name text,
emails set<text> );
INSERT INTO users (user_id, first_name, last_name, emails)
VALUES ('frodo', 'Frodo', 'Baggins',
{'[email protected]', '[email protected]'});
TTL
INSERT INTO clicks (userid, url, date, name)
VALUES (3715e600-2eb0-11e2-81c1-0800200c9a66, 'http://apache.org', '2013-10-09', 'Mary')
USING TTL 86400;
Ordering
CREATE TABLE timeseries (
event_type text,
insertion_time timestamp,
PRIMARY KEY (event_type, insertion_time))
WITH CLUSTERING ORDER BY (insertion_time DESC);
SELECT * FROM emp
WHERE empID IN (130,104) ORDER BY deptID DESC;
Slicing
// To retrieve events for the 12th of January 2014 between 3:50:00 and 4:37:30:
SELECT * FROM timeline WHERE day = '12 Jan 2014'
  AND (hour, min) >= (3, 50) AND (hour, min, sec) <= (4, 37, 30);
Batching
BEGIN BATCH
INSERT INTO purchases (user, balance) VALUES ('user1', -8)
IF NOT EXISTS;
INSERT INTO purchases (user, expense_id, amount,
description, paid) VALUES ('user1', 1, 8, 'bur', false);
APPLY BATCH;
Counters
CREATE TABLE counters.page_view_counts (counter_value
counter, url_name varchar, page_name varchar, PRIMARY KEY
(url_name, page_name) );
UPDATE counters.page_view_counts SET counter_value =
counter_value + 1 WHERE url_name='www.datastax.com' AND
page_name='home';
Lightweight Transactions
INSERT INTO users (login, email, name, login_count)
VALUES ('jdoe', '[email protected]', 'Jane Doe', 1) IF NOT EXISTS;
UPDATE users SET email = '[email protected]'
WHERE login = 'jdoe' IF email = '[email protected]';
JAVA DRIVER
DataStax Java Driver
DataStax developed a new protocol that doesn't have RPC
limitations (Asynchronous I/O)
Low-level API with simple mapping
Works with CQL3
QueryBuilder resembles the Criteria API
Accessor-annotated interfaces
Async Query
Accessor
Do you want more adventures?
Other Cassandra object mappers
Achilles: well documented and provides transactions
Astyanax: connection pool, thread safety and pagination
Pelops: old project, a good reinvented wheel
PlayORM: strange but powerful thing
Easy-Cassandra: simple annotations + CRUD
Thrift as a low-level API
CASSANDRA INTERNALS
Cassandra write path (COMMITLOG)
Cassandra write path (MemTable)
Cassandra write path (SSTables)
Cassandra write path
Cassandra write path (FLUSH)
Cassandra write path (COMPACTION)
Cassandra READ path
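The stages named on the write-path and read-path slides (commit log, memtable, flush to SSTables, compaction, read) fit into a small in-memory model. The sketch below is an illustration under heavy assumptions: `LsmSketch`, its flush threshold and its method names are hypothetical, everything lives in memory, and timestamps, tombstones for deletes and on-disk formats are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory model of the write and read path.
class LsmSketch {
    private final List<String> commitLog = new ArrayList<>();                   // append-only log
    private TreeMap<String, String> memtable = new TreeMap<>();                 // sorted recent writes
    private final List<SortedMap<String, String>> sstables = new ArrayList<>(); // immutable, oldest first
    private final int flushThreshold;

    LsmSketch(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value); // 1. append to the commit log
        memtable.put(key, value);         // 2. update the memtable
        if (memtable.size() >= flushThreshold) {
            flush();                      // 3. flush a full memtable
        }
    }

    // Flush: freeze the memtable as a new SSTable and start a fresh one.
    void flush() {
        if (memtable.isEmpty()) {
            return;
        }
        sstables.add(Collections.unmodifiableSortedMap(memtable));
        memtable = new TreeMap<>();
        commitLog.clear(); // flushed data no longer needs the log
    }

    // Compaction: merge all SSTables into one; newer tables overwrite older ones.
    void compact() {
        TreeMap<String, String> merged = new TreeMap<>();
        for (SortedMap<String, String> table : sstables) {
            merged.putAll(table);
        }
        sstables.clear();
        if (!merged.isEmpty()) {
            sstables.add(Collections.unmodifiableSortedMap(merged));
        }
    }

    // Read: memtable first, then SSTables from newest to oldest.
    String read(String key) {
        String value = memtable.get(key);
        if (value != null) {
            return value;
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            value = sstables.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    int sstableCount() {
        return sstables.size();
    }

    public static void main(String[] args) {
        LsmSketch db = new LsmSketch(2);
        db.write("a", "1");
        db.write("b", "2"); // memtable is full: flushed to the first SSTable
        db.write("a", "3");
        db.write("c", "4"); // flushed to the second SSTable
        System.out.println(db.read("a") + " in " + db.sstableCount() + " sstables"); // prints 3 in 2 sstables
        db.compact();
        System.out.println(db.read("a") + " in " + db.sstableCount() + " sstables"); // prints 3 in 1 sstables
    }
}
```

Writes never modify an existing SSTable in this model, which is why reads may have to consult several tables until compaction merges them.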
Bloom Filter
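A Bloom filter answers "is this key possibly here?" with no false negatives and a tunable rate of false positives, which lets a read skip SSTables that cannot contain the requested key. The sketch below is a minimal version built on `java.util.BitSet`; the class `BloomFilterSketch`, the double-hashing scheme and the sizes are assumptions chosen for illustration and are not Cassandra's implementation.

```java
import java.util.BitSet;

// Hypothetical minimal Bloom filter; sizes and hash mixing are illustrative.
class BloomFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    BloomFilterSketch(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 * 0x9E3779B1) | 1; // second hash, forced odd so it is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false: the key was definitely never added; true: the key may have been added.
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BloomFilterSketch filter = new BloomFilterSketch(1024, 3);
        filter.add("user1");
        System.out.println(filter.mightContain("user1")); // prints true
        System.out.println(filter.mightContain("user2")); // usually false, but a false positive is possible
    }
}
```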
Case #6 : Strong Wall with Bloom Filters
Cassandra Virtual nodes
CASSANDRA TOOLBOX
Config changes
# vim /etc/cassandra/conf/cassandra.yaml
cluster_name: 'cassandra_cluster'
seeds: "192.168.10.1, 192.168.10.2, 192.168.10.3"
listen_address: 192.168.10.1
# vim /etc/cassandra/cassandra-topology.properties
# Cassandra Node IP=Data Center:Rack
192.168.10.1=dc1:rac1
192.168.10.2=dc1:rac1
192.168.10.3=dc1:rac1
# default for unknown nodes
default=DC1:r1
Check cluster
$ cassandra
$ nodetool status
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  192.168.10.1  128.89 KB  256     62.4%             1a360230-95c7-44ec-a61f-f314e374da6e  rack1
UN  192.168.10.2  89.97 KB   256     67.4%             e92cc1f5-69e2-4fe0-bf4b-28c1bf5b0131  rack1
UN  192.168.10.3  147.77 KB  256     70.2%             27f5040b-072f-4ec7-bb7c-62021a454e39  rack1
$ cqlsh 192.168.10.1
cqlsh> create keyspace data with replication={'class':'SimpleStrategy','replication_factor':2};
cqlsh> use data;
DataStax Studio
NoSQL: Basics + Mongo
Alexey Zinovyev, Java/BigData Trainer in EPAM
MONGO WORLD
Case #7 : Customer wants MongoDB
Mongo Driver
BSON (something like JSON)
Adds data types that JSON does not support (ISO dates, ObjectId, etc.)
Optimized for performance
Adds compression
MONGO TOOLBOX
Install
Robo 3T [Robomongo]
Demo with Java Driver 3.3
Do you have an ORM framework?
ODM
Morphia
Object Document Mapper
Specified with annotations
Implemented with reflection
Runtime validation
Morphia advantages
Integrated with Spring, Guice and other DI frameworks
Lifecycle Method Annotations (@PrePersist, @PostLoad)
Built on top of Mongo Java Driver
Better than old-style queries built from BSON objects
It has a convenient Query API
Morphia Entity
Other Mongo object mappers
Jongo: mongo shell queries in Java code
EclipseLink: varying levels of support for different NoSQL databases
MJORM: Google Code, XML mapping + MQL (SQL syntax for extracting Mongo data)
DataNucleus: supports many Java persistence standards such as JDO and JPA
Modern Web App
Polyglot Persistence
Redis: Rapid access for reads and writes. No need to be durable.
RDBMS: Needs transactional updates and has tabular structure.
Riak: Needs high availability across multiple locations. Can merge inconsistent writes.
Polyglot Persistence
Neo4j: Rapidly traverse links between friends and ratings.
MongoDB: Lots of reads, infrequent writes. Powerful aggregation mechanism.
Cassandra: Large-scale analytics on large cluster. High volume of writes on multiple nodes.
SPRING INTEGRATION
Spring Data
Spring Data MongoDB
Templating: connection configs, collection lifecycle (create, drop), Map/Reduce + Aggregation
Mapping: @Document, @Indexed, @Field
Repository support: geospatial queries, queries derived from method signatures (at runtime)
Paging, sorting, CRUD operations
Mongo config
<mongo:mongo host="${mongo.host}" port="${mongo.port}">
  <mongo:options
    connections-per-host="${mongo.connectionsPerHost}"
    threads-allowed-to-block-for-connection-multiplier="${mongo.threadsAllowedToBlockForConnectionMultiplier}"
    connect-timeout="${mongo.connectTimeout}"
    max-wait-time="${mongo.maxWaitTime}"
    auto-connect-retry="${mongo.autoConnectRetry}"
    socket-keep-alive="${mongo.socketKeepAlive}"
    socket-timeout="${mongo.socketTimeout}"
    slave-ok="${mongo.slaveOk}"
    write-number="1"
    write-timeout="0"
    write-fsync="true"/>
</mongo:mongo>
<mongo:db-factory dbname="test" mongo-ref="mongo"/>
Spring & MongoDB
Best Practice: SQL for NoSQL
Any questions?