Cassandra Introduction
Apache
1
Me
Robert Stupp
Freelancer, Coder, Architect
@snazy snazy@snazy.de
2
Agenda
Apache Cassandra History
Design Principles
Outstanding differences
CQL Intro
Access C*
Clusters
Cassandra Future
3
Apache Cassandra
History
4
Apache Cassandra
started at Facebook
inspired by
5
2.1 released in Sep 2014
6
Apache Cassandra
Design Principles
7
Hardware failures
can and will occur!
is to understand
Cassandra’s simplicity
9
Keep it simple
all nodes are equal
master-less architecture
no name nodes
10
Keep it running
during maintenance
11
Outstanding
Differences
12
Cassandra
Highly scalable
runs with a few nodes
up to clusters of 1000+ nodes!
No SPOF
13
Cassandra @ Apple
14
Linear Scalability
15
Scaling Cassandra
More data?
-> add more nodes
Faster access?
-> add more nodes
16
Read / Write
performance
17
Durability
18
Availability @
Netflix
Chaos
Monkey
19
Availability @
Netflix
Chaos
Gorilla
20
Availability @
Netflix
Chaos
Kong
21
Availability @
Netflix
http://de.slideshare.net/planetcassandra/active-active-c-behind-the-scenes-at-netflix
22
32-node cluster (Raspberry Pis)
@DataStax
23
Most outstanding
Great documentation
Many presentations
Many videos
Regular webinars
24
Data Distribution
25
DHT
Data is organized in a
“Distributed Hash Table”
26
DHT
(Diagram: ring of nodes numbered 0 to 7)
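The ring idea can be sketched in a few lines of Python. This is only a toy model, not Cassandra's partitioner: the 32-bit token space, the MD5-based `token_for` helper, and the evenly spaced tokens are assumptions made for illustration.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 32  # hypothetical token space, chosen only to keep numbers small


def token_for(key: str) -> int:
    """Hash a key to a position on the ring (MD5 is used purely for illustration)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


class TokenRing:
    """Toy hash ring: node i owns every token up to and including tokens[i]."""

    def __init__(self, node_count: int = 8):
        step = RING_SIZE // node_count
        # Evenly spaced upper bounds of each node's token range.
        self.tokens = [(i + 1) * step - 1 for i in range(node_count)]

    def owner_of_token(self, token: int) -> int:
        # First node whose upper bound is >= token; wrap around past the last node.
        idx = bisect.bisect_left(self.tokens, token)
        return idx % len(self.tokens)

    def owner(self, key: str) -> int:
        return self.owner_of_token(token_for(key))
```

Calling `TokenRing(8).owner("some key")` always returns the same node number for the same key, which is what lets any node route a request without a central lookup service.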
27
Replication
28
Replication Factor 2
(Diagram: ring of nodes 0 to 7 showing the replicas of Row A and Row B)
29
Replication Factor 3
(Diagram: ring of nodes 0 to 7 showing the replicas of Row A and Row B)
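A minimal sketch of how a replication factor could translate into replica placement on an 8-node ring. It assumes the simplest possible policy (the owning node plus the next nodes clockwise); the function name `replicas` is hypothetical, and the nodes actually holding Row A and Row B in the diagrams are not recoverable from the slides.

```python
from typing import List


def replicas(primary: int, replication_factor: int, node_count: int = 8) -> List[int]:
    """Return the nodes holding a row: the primary plus the next nodes clockwise."""
    if not 1 <= replication_factor <= node_count:
        raise ValueError("replication factor must be between 1 and the node count")
    # Walk the ring clockwise, wrapping from the last node back to node 0.
    return [(primary + i) % node_count for i in range(replication_factor)]
```

With this policy, `replicas(0, 2)` places a row on nodes 0 and 1, and `replicas(6, 3)` wraps around the ring to nodes 6, 7 and 0.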
30
Consistency
31
Eventual consistency
is not
hopefully consistent
32
Consistency Levels
ANY (only for writes)
ONE, LOCAL_ONE,
SERIAL, LOCAL_SERIAL
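The arithmetic behind consistency levels can be sketched as follows. `QUORUM` and `ALL` are real Cassandra consistency levels, although they do not appear in the list as extracted here; the helper names are hypothetical, and only three levels are modelled.

```python
def required_acks(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for the given consistency level."""
    table = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,  # strict majority of replicas
        "ALL": replication_factor,
    }
    return table[level]


def reads_see_latest_write(write_level: str, read_level: str, replication_factor: int) -> bool:
    """True when read and write replica sets must overlap (R + W > RF)."""
    w = required_acks(write_level, replication_factor)
    r = required_acks(read_level, replication_factor)
    return r + w > replication_factor
```

For a replication factor of 3, writing and reading at `QUORUM` needs 2 + 2 responses, so at least one replica in every read has seen the latest write; `ONE` for both does not give that guarantee.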
33
Consistency
34
Write
(Diagram: a write arriving at the ring of nodes 0 to 7)
35
Write
(Diagram: a write arriving at the ring of nodes 0 to 7)
36
Multi DC setup
DC 1 DC 2
37
Multi DC replication
Write
DC 1 DC 2
38
Multi DC replication
Write
DC 1 DC 2
39
Multi DC replication
Write
DC 1 DC 2
40
Replication &
Consistency
Define # of replicas
using replication factor
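As an illustration of where the replication factor is defined, the sketch below builds a `CREATE KEYSPACE` statement with one replication factor per data center. The keyspace name `demo` and the data center names `DC1` and `DC2` are placeholders, and the helper `create_keyspace_cql` is hypothetical; it only produces the statement text and does not talk to a cluster.

```python
def create_keyspace_cql(name: str, replicas_per_dc: dict) -> str:
    """Build a CREATE KEYSPACE statement with a replication factor per data center."""
    options = ["'class': 'NetworkTopologyStrategy'"]
    # One entry per data center: the data center name maps to its replication factor.
    options += [f"'{dc}': {rf}" for dc, rf in replicas_per_dc.items()]
    return f"CREATE KEYSPACE {name} WITH replication = {{{', '.join(options)}}};"


if __name__ == "__main__":
    print(create_keyspace_cql("demo", {"DC1": 3, "DC2": 2}))
```

The resulting statement could be pasted into cqlsh; the consistency level, by contrast, is chosen per request rather than in the schema.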
41
CQL Introduction
42
“CQL is SQL
minus joins,
minus subqueries,
plus collections”
(plus user types,
plus tuple types)
43
Why CQL?
Familiar syntax
Easy to understand
44
Data model
(hierarchical view)
Keyspace (schema)
Row
static columns
columns
45
CQL / DDL
Similar to SQL
CREATE TABLE …
ALTER TABLE …
DROP TABLE …
46
CQL / DML
Similar to SQL
INSERT …
UPDATE …
DELETE …
SELECT …
47
CQL / BATCH
Atomic operation
48
CQL types
boolean, int (32bit), bigint (64bit),
float, double,
decimal ("BigDecimal"),
varint ("BigInteger"),
49
CQL collection
types
list<foo>
51
CQL / user types
52
Cassandra
Data Modeling
Access by key
no access by arbitrary WHERE clause
Aggregate data
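The points above can be mimicked with a toy in-memory model: a table that can only be read by its key, and one such table per question the application asks. The class `QueryTable`, the two example tables, and `add_user` are all hypothetical names for illustration, not part of any Cassandra API.

```python
from collections import defaultdict


class QueryTable:
    """Toy table that can only be read by its partition key."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, partition_key, row: dict) -> None:
        self._partitions[partition_key].append(row)

    def select(self, partition_key) -> list:
        # Lookup by key only: there is no way to filter on other columns.
        return list(self._partitions.get(partition_key, []))


# One table per question: "which user has this id?" and "who lives in this city?"
users_by_id = QueryTable()
users_by_city = QueryTable()


def add_user(user_id: int, name: str, city: str) -> None:
    """Write the same data to every table that answers a question about it."""
    row = {"user_id": user_id, "name": name, "city": city}
    users_by_id.insert(user_id, row)
    users_by_city.insert(city, row)
```

The data is deliberately duplicated at write time so that each read is a single lookup by key.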
53
RDBMS modeling
54
C* modeling
55
Data Modeling
with RDBMS
Driven by
"What questions
do I have?"
57
Data Modeling
Basics
58
Data Modeling
http://de.slideshare.net/planetcassandra/cassandra-day-sv-2014-fundamentals-of-apache-cassandra-data-modeling
http://de.slideshare.net/planetcassandra/data-modeling-with-travis-price
59
Accessing
Cassandra
60
Command Line
cqlsh
CQL shell
nodetool
node/cluster administration
61
GUI: DevCenter
62
Stress test?
63
DataStax open source drivers (Apache License v2)
for Java
for Python
for C#
https://github.com/datastax/
or http://www.datastax.com/download
64
Native protocol
Request multiplexing
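Request multiplexing means several requests can be in flight on one connection, with each response matched back to its request by a stream id. The sketch below models only that bookkeeping; the class `MultiplexedConnection`, its method names, and the limit of 128 streams are assumptions for illustration, and no network traffic is involved.

```python
class MultiplexedConnection:
    """Toy model of one connection carrying many in-flight requests."""

    def __init__(self, max_streams: int = 128):
        self._free_ids = list(range(max_streams))
        self._pending = {}  # stream id -> query still waiting for its response

    def send(self, query: str) -> int:
        """Register a request and return the stream id that tags it."""
        if not self._free_ids:
            raise RuntimeError("no free stream ids, too many requests in flight")
        stream_id = self._free_ids.pop()
        self._pending[stream_id] = query
        return stream_id

    def on_response(self, stream_id: int, payload: str):
        """Match a response to its request and release the stream id for reuse."""
        query = self._pending.pop(stream_id)
        self._free_ids.append(stream_id)
        return query, payload
```

Because responses carry the stream id, they may arrive in any order, so a slow query does not block faster ones sharing the same connection.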
65
Third Party Drivers
66
Mappers
67
Spark + Hadoop
68
Clusters
69
Cluster sizes
70
Cluster setup
71
Cluster experience
“Disaster proven”
Hurricanes
Amazon DC outages
72
Apache Cassandra
Future
73
Cassandra 3.0
(in development)
User Defined Functions
Aggregate functions
Functional indexes
(Subject to change!)
74
Get active!
75
Cassandra Community
http://cassandra.apache.org/
http://planetcassandra.org/ - Blog
http://www.slideshare.net/planetcassandra/presentations
http://de.slideshare.net/DataStax/presentations
76
Cassandra Community
https://www.youtube.com/user/PlanetCassandra
https://www.youtube.com/user/DataStax
http://www.datastax.com/dev/blog/
http://www.datastax.com/docs/
77
Free C* Training!
http://planetcassandra.org/cassandra-training/
78
Get involved!
Ask questions,
submit RFEs or experiences to
user@cassandra.apache.org
79
Live Demo
User Defined Functions
80
C* 3.0 UDFs
81
C* 3.0 UDFs
Example
This is JavaScript!
82
UDFs for what?
Targeted for C* 3.0
83
Thanks
for your attention
Robert Stupp
@snazy
snazy@snazy.de
de.slideshare.net/RobertStupp
84
Q & A
85
BACKUP SLIDES
User-Defined-Functions
Demo
87