Cassandra Lesson - Data Model and CQL3
[Figure: structure of a column (column_name, value, timestamp); example labels: name, 104, linda]
Table with standard PRIMARY KEY
CREATE TABLE messages (
msg_id timeuuid PRIMARY KEY,
author text,
body text
);
Table: Tweets (PRIMARY KEY = msg_id)
 msg_id | author | body
--------+--------+--------------
   9990 |   otto | Hello World!
   9991 |  linda | Hi, Otto
Table with compound PRIMARY KEY
CREATE TABLE timeline (
user_id uuid,
msg_id timeuuid,
author text,
body text,
PRIMARY KEY (user_id, msg_id)
);
“Wide-row” Table: Timeline
PRIMARY KEY = user_id + msg_id
[Figure: wide rows keyed by user_id (103, 104, 211, 212), each holding one group of columns per msg_id (labels such as 9990, 8090, 9994, 8555); the rows are distributed across Node A and Node B]
Comparison: RDBMS vs. Cassandra
Launch server A (Germany)
cassandra.yaml:
seeds: A
rpc_address: 0.0.0.0
A~$ service cassandra start
Exercise: Start CQL Shell
Intro: CLI, CQL2, CQL3
● CQL is “SQL for Cassandra”
● Cassandra CLI deprecated, CQL2 deprecated
● CQL3 is default since Cassandra 1.2
$ cqlsh
● Pipe scripts into cqlsh
$ cat cql_script | cqlsh
● Source files inside cqlsh
cqlsh> SOURCE '~/cassandra_training/cql3/01_create_keyspaces';
CQL3
● Create a keyspace
● Create a column family
● Insert data
● Alter schema
● Update data
● Delete data
● Apply batch operation
● Read data
● Secondary Index
● Compound Primary Key
● Collections
● Consistency level
● Time-To-Live (TTL)
● Counter columns
● sstable2json utility tool
Create a SimpleStrategy keyspace
● Create a keyspace that uses SimpleStrategy with the "replication_factor" option set to 3.
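The statement for this slide did not survive in the transcript. A minimal sketch of what it might look like, assuming the keyspace is the "twotter" keyspace used in the later exercises:

```cql
-- Sketch: the keyspace name "twotter" is taken from the later exercises.
CREATE KEYSPACE twotter
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Switch to the keyspace; the cqlsh prompt then reads "cqlsh:twotter>".
USE twotter;
```

A replication factor of 3 means that each row is stored on three nodes of the cluster.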
*we use int instead of uuid in the exercises for the sake of readability
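The footnote above appears to belong to a CREATE TABLE slide for the "users" table that is missing here. A sketch inferred from the columns used by the later INSERT statements (id, name, email); the text types for name and email are assumptions:

```cql
-- Assumed schema: only the column names and the int id come from the slides.
CREATE TABLE users (
  id int PRIMARY KEY,
  name text,
  email text
);
```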
Exercise: Create Table "messages"
● Create a new Table named "messages" with the
attributes "posted_on", "user_id", "user_name", "body",
and a primary key that consists of "user_id" and
"posted_on".
cqlsh:twotter> CREATE TABLE messages (
posted_on bigint,
user_id int,
user_name text,
body text,
PRIMARY KEY (user_id, posted_on)
);
*we use bigint instead of timeuuid in the exercises for the sake of readability
Exercise: Insert data into Table "users" of keyspace "twotter"
cqlsh:twotter>
INSERT INTO users(id, name, email)
VALUES (101, 'otto', '[email protected]');
cqlsh> SOURCE '~/cassandra_training/cql3/03_insert';
Exercise: Insert message records
cqlsh:twotter>
INSERT INTO messages (user_id, posted_on,
user_name, body)
VALUES (101, 1384895178, 'otto', 'Hello!');
id | email | name
-----+--------------+-------
105 | [email protected] | gerd
104 | [email protected] | linda
102 | null | jane
106 | [email protected] | heinz
101 | [email protected] | otto
103 | null | karl
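The listing above looks like the result of reading the "users" table, but the query itself is not part of the transcript. A sketch of reads that would fit the "Read data" and "Compound Primary Key" agenda items, using the tables and values from the exercises:

```cql
-- Read every row (fine for a small training table).
SELECT * FROM users;

-- Read selected columns of one row by its primary key.
SELECT name, email FROM users WHERE id = 101;

-- Compound primary key: fix the partition key (user_id) and
-- slice on the clustering column (posted_on), newest first.
SELECT posted_on, body
  FROM messages
 WHERE user_id = 101
   AND posted_on >= 1384895178
 ORDER BY posted_on DESC;
```

An unrestricted SELECT returns rows in token order of the partition key, which is why the ids in the listing are not sorted numerically.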
Update data
cqlsh:twotter> UPDATE users
SET email = '[email protected]'
WHERE id = 102;
id | email | name
-----+----------------+-------
105 | [email protected] | gerd
104 | [email protected] | linda
102 | [email protected] | jane
106 | [email protected] | heinz
101 | [email protected] | otto
103 | null | karl
Delete data
● Delete columns
cqlsh:twotter> DELETE email
FROM users
WHERE id = 105;
● Delete an entire row
id | email | name
-----+----------------+-------
105 | null | gerd
104 | [email protected] | linda
102 | [email protected] | jane
101 | [email protected] | otto
103 | null | karl
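The statement for the "Delete an entire row" bullet is missing from the transcript. A sketch, assuming the removed row is id 106, which appears in the earlier listings but not in the one above:

```cql
-- Omitting the column list deletes the whole row.
DELETE FROM users WHERE id = 106;
```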
Batch operation
● Execute multiple mutations with a single operation
cqlsh:twotter>
BEGIN BATCH
INSERT INTO users(id, name, email)
VALUES(107, 'john','[email protected]')
INSERT INTO users(id, name)
VALUES(108, 'michael')
UPDATE users
SET email = '[email protected]'
WHERE id = 108
DELETE FROM users WHERE id = 105
APPLY BATCH;
id | email | name
-----+----------------+---------
107 | [email protected] | john
108 | [email protected] | michael
104 | [email protected] | linda
102 | [email protected] | jane
101 | [email protected] | otto
103 | null | karl
Secondary Index
cqlsh:twotter>
CREATE INDEX name_index ON users(name);
cqlsh:twotter>
CREATE INDEX email_index ON users(email);
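With these indexes in place, the indexed columns can be used in a WHERE clause without the primary key. A short sketch using a value from the exercises:

```cql
-- Served by name_index; without an index on "name",
-- Cassandra 1.2/2.0 rejects this query.
SELECT * FROM users WHERE name = 'otto';

-- An index that is no longer needed can be removed again.
DROP INDEX name_index;
```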
Alter schema
cqlsh:twotter>
ALTER TABLE users
ADD password_reset_token text;
Collections
cqlsh:twotter>
UPDATE messages
SET comments = comments + {'otto':'thx!'}
WHERE user_id = 103
AND posted_on = 1384895223;
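The UPDATE above presupposes that "messages" has a map column named comments; the statement that adds it is not in the transcript. A sketch, assuming a map from commenter name to comment text (the 'linda' entry is made up for illustration):

```cql
-- Assumed definition of the collection column used above.
ALTER TABLE messages ADD comments map<text, text>;

-- Set or overwrite a single map entry by key.
UPDATE messages
   SET comments['linda'] = 'nice!'
 WHERE user_id = 103
   AND posted_on = 1384895223;

-- Remove one entry from the map.
DELETE comments['linda']
  FROM messages
 WHERE user_id = 103
   AND posted_on = 1384895223;
```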
Consistency Level
● Set the consistency level for all subsequent requests:
cqlsh:twotter> CONSISTENCY ONE;
cqlsh:twotter> CONSISTENCY QUORUM;
cqlsh:twotter> CONSISTENCY ALL;
● Show the current consistency level:
cqlsh:twotter> CONSISTENCY;
Exercise: Consistency Level
● Set the consistency level to ANY and execute a SELECT
statement.
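A possible walk-through of this exercise. ANY is a write-only consistency level, so the read is expected to be rejected; the exact error text depends on the Cassandra version and is not reproduced here:

```cql
-- cqlsh command, not a CQL statement.
CONSISTENCY ANY;

-- Expected to fail: ANY is only supported for writes.
SELECT * FROM users;

-- Return to a level that is valid for reads.
CONSISTENCY ONE;
```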
Counter columns
cqlsh:twotter>
UPDATE votes
SET downvote = downvote + 2
WHERE user_id = 101
AND msg_created_on = 1234;
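The "votes" table itself is not defined in the transcript. A sketch of a counter table that would accept the UPDATE above; the upvote column and the column types are assumptions:

```cql
-- In a counter table, every column outside the primary key must be a counter.
CREATE TABLE votes (
  user_id int,
  msg_created_on bigint,
  upvote counter,    -- hypothetical companion column
  downvote counter,
  PRIMARY KEY (user_id, msg_created_on)
);

-- Counters are never INSERTed; the first UPDATE creates the row.
UPDATE votes
   SET upvote = upvote + 1
 WHERE user_id = 101
   AND msg_created_on = 1234;
```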
[Figure: a .db file of size X MB becomes a .json file of size 2X MB]
Exercise: sstable2json
● Insert a few records
● Flush the users column family to disk and create a json
representation of a *.db file.
Solution: sstable2json
$ nodetool flush twotter users
$ sstable2json /var/lib/cassandra/data/twotter/users/twotter-users-jb-1-Data.db > twotter-users.json
$ cat twotter-users.json
[{"key": "00000069","columns": [["","",
1384963716697000], ["email","[email protected]",
1384963716697000], ["name","gerd",
1384963716697000]]},
{"key": "00000068","columns": [["","",
1384963716685000], ...
CQL v3.1.0
(New in Cassandra 2.0)
● IF keyword
● Lightweight transactions (“Compare-And-Set”)
● Triggers (experimental!!)
● CQL paging support
● Drop column support
● SELECT column aliases
● Conditional DDL
● Index enhancements
● cqlsh COPY
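A few of the features listed above, sketched against the "twotter" keyspace and "users" table from the exercises. The nickname column and the users.csv file name are made up for illustration:

```cql
-- Conditional DDL: no error if the keyspace already exists.
CREATE KEYSPACE IF NOT EXISTS twotter
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Drop column support (nickname is a hypothetical column).
ALTER TABLE users ADD nickname text;
ALTER TABLE users DROP nickname;

-- SELECT column aliases.
SELECT name AS user_name FROM users WHERE id = 101;

-- cqlsh COPY (a cqlsh command, not CQL): export rows to a CSV file.
COPY users (id, name, email) TO 'users.csv';
```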
IF Keyword
Source: http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/dml/dml_about_transactions_c.html
Lightweight Transactions
1. Register a new user
cqlsh:twotter>
INSERT INTO users (id, name, email)
VALUES (110, 'franz', '[email protected]')
IF NOT EXISTS;
2. Perform a CAS reset of Franz’s email.
cqlsh:twotter>
UPDATE users
SET email = '[email protected]'
WHERE id = 110
IF email = '[email protected]';
Exercise: Lightweight Transactions
[applied] | email
-----------+-------------
False | [email protected]
Exercise: Lightweight Transactions
● Write a password reset method by using an
expiring password_reset_token column and
a CAS password update query.
Solution: Lightweight Transactions
cqlsh:twotter>
UPDATE users USING TTL 77
SET password_reset_token = 'abc-xyz-123'
WHERE id = 110;
cqlsh:twotter>
UPDATE users
SET password = 'geheim!'
WHERE id = 110
IF password_reset_token = 'abc-xyz-123';
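The statements above rely on a password column that no slide in this transcript adds (password_reset_token comes from the ALTER TABLE example). A sketch of the missing step and of one way to watch the token expire:

```cql
-- Assumed prerequisite: the password column must exist before the CAS update.
ALTER TABLE users ADD password text;

-- Remaining lifetime of the token in seconds; null once it has expired.
SELECT password_reset_token, TTL(password_reset_token)
  FROM users
 WHERE id = 110;
```

Once the 77 seconds have passed, the token reads as null and the conditional UPDATE should report [applied] = False.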
Create a Trigger
(experimental feature)
● Triggers are written in Java.
● Triggers are currently an experimental feature
in Cassandra 2.0. Use with caution!
cqlsh:twotter>
CREATE TRIGGER myTrigger ON users USING 'org.apache.cassandra.triggers.InvertedIndex';