Cassandra Lesson - Data Model and CQL3

The document discusses Cassandra data modeling and queries using the Cassandra Query Language (CQL). It covers topics like keyspaces, column families, primary keys, secondary indexes, collections, batch operations, and more. The document also provides exercises to create keyspaces and tables, insert and query data, and alter schemas.

Uploaded by

Sam Bui

Cassandra Data Modelling and Queries with CQL3
By Markus Klems
(2013)
Source: http://geek-and-poke.com
Data Model
● Keyspace (Database)
● Column Family (Table)
● Keys and Columns
A column
column_name | value | timestamp
The timestamp field is used by Cassandra for conflict resolution: “Last Write Wins”
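The “Last Write Wins” rule can be illustrated with a small Python model. This is a simplified sketch, not Cassandra's actual implementation: the `Cell` class and the `resolve` function are hypothetical names, the telephone values are made up, and timestamp ties are not modelled faithfully (the sketch simply keeps its first argument).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """One column as drawn on the slide: name, value, write timestamp."""
    column_name: str
    value: str
    timestamp: int  # microseconds since the epoch, supplied with the write


def resolve(a: Cell, b: Cell) -> Cell:
    """Return the version with the newer timestamp (ties keep `a` in this sketch)."""
    return a if a.timestamp >= b.timestamp else b


# Two conflicting versions of the same column, e.g. found on two replicas.
old = Cell("tel", "12345", 1384963716685000)
new = Cell("tel", "67890", 1384963716697000)

winner = resolve(old, new)
print(winner.value)
```

The argument order does not matter when the timestamps differ, which is why replicas can reconcile versions independently and still agree.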
Column family (Table)
 partition key | columns ...
---------------+------------------------------------------------------------
 101 | email: [email protected], name: otto, tel: 12345
 103 | email: [email protected], name: karl, tel: 6789, tel2: 12233
 104 | name: linda
Table with standard PRIMARY KEY
CREATE TABLE messages (
msg_id timeuuid PRIMARY KEY,
author text,
body text
);
Table: Tweets
PRIMARY KEY = msg_id
 msg_id | author | body
--------+--------+--------------
 9990 | otto | Hello World!
 9991 | linda | Hi, Otto
Table with compound PRIMARY KEY
CREATE TABLE timeline (
user_id uuid,
msg_id timeuuid,
author text,
body text,
PRIMARY KEY (user_id, msg_id)
);
“Wide-row” Table: Timeline
PRIMARY KEY = user_id + msg_id
partition key column: user_id
 user_id | msg_id | author | body
---------+--------+--------+--------------
 103 | 9990 | otto | Hello World!
 103 | 9991 | linda | Hi @otto
Timeline Table is partitioned by user and locally clustered by msg
 node | user_id | msg_id
--------+---------+------------------
 Node A | 103 | 9990, 9994
 Node A | 104 | 9881, 9999
 Node B | 211 | 8090, 8555, 9678
 Node B | 212 | 9877
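The idea of "partitioned by user, clustered by msg" can be sketched in Python as follows. This is only an illustration under stated assumptions: the two-node list, the MD5-based `node_for` function, and the `place` helper are hypothetical stand-ins for a real partitioner, so the node that a given user lands on will not necessarily match the figure.

```python
import hashlib
from collections import defaultdict

NODES = ["Node A", "Node B"]  # hypothetical two-node cluster


def node_for(partition_key: int) -> str:
    """Map a partition key to a node via a hash (stand-in for a real partitioner)."""
    digest = hashlib.md5(str(partition_key).encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


def place(rows):
    """Group rows by node and partition key, then sort each partition by msg_id."""
    cluster = defaultdict(lambda: defaultdict(list))
    for user_id, msg_id in rows:
        cluster[node_for(user_id)][user_id].append(msg_id)
    for partitions in cluster.values():
        for msg_ids in partitions.values():
            msg_ids.sort()  # the clustering column defines the order inside a partition
    return cluster


# (user_id, msg_id) pairs taken from the figure, in arbitrary arrival order.
rows = [(103, 9994), (211, 8555), (103, 9990), (104, 9999),
        (211, 8090), (104, 9881), (212, 9877), (211, 9678)]
layout = place(rows)

for node, partitions in sorted(layout.items()):
    for user_id, msg_ids in sorted(partitions.items()):
        print(node, user_id, msg_ids)
```

All messages of one user end up in the same partition on the same node, so reading a user's timeline touches a single partition and returns rows already sorted by `msg_id`.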
Comparison: RDBMS vs. Cassandra

RDBMS Data Design
Users Table
 user_id | name | email
 101 | otto | [email protected]
Tweets Table
 tweet_id | author_id | body
 9990 | 101 | Hello!
Followers Table
 id | follows_id | followed_id
 4321 | 104 | 101

Cassandra Data Design
Users Table
 user_id | name | email
 101 | otto | [email protected]
Tweets Table
 tweet_id | author_id | name | body
 9990 | 101 | otto | Hello!
Follows Table
 user_id | follows_list
 104 | [101,117]
Followed Table
 id | followed_list
 101 | [104,109]
Exercise: Launch 1 Cassandra Node
Data Center DC1 (Germany): launch server A
/etc/conf/cassandra.yaml
seeds: A
rpc_address: 0.0.0.0
A~$ service cassandra start
Exercise: Start CQL Shell
Data Center DC1 (Germany), server A
A~$ cqlsh
Intro: CLI, CQL2, CQL3
● CQL is “SQL for Cassandra”
● Cassandra CLI deprecated, CQL2 deprecated
● CQL3 is default since Cassandra 1.2
$ cqlsh
● Pipe scripts into cqlsh
$ cat cql_script | cqlsh
● Source files inside cqlsh
cqlsh> SOURCE '~/cassandra_training/cql3/01_create_keyspaces';
CQL3
● Create a keyspace
● Create a column family
● Insert data
● Alter schema
● Update data
● Delete data
● Apply batch operation
● Read data
● Secondary Index
● Compound Primary Key
● Collections
● Consistency level
● Time-To-Live (TTL)
● Counter columns
● sstable2json utility tool
Create a SimpleStrategy keyspace
● Create a keyspace with SimpleStrategy and
"replication_factor" option with value "3" like
this:

cqlsh> CREATE KEYSPACE <ksname>
WITH REPLICATION = {
'class':'SimpleStrategy',
'replication_factor':3
};
Exercise: Create a SimpleStrategy
keyspace
● Create a keyspace "simpledb" with SimpleStrategy and
replication factor 1.
Exercise: Create a SimpleStrategy
keyspace

cqlsh> CREATE KEYSPACE simpledb
WITH REPLICATION = {
'class' : 'SimpleStrategy',
'replication_factor' : 1 };
cqlsh> DESCRIBE KEYSPACE simpledb;
Create a NetworkTopologyStrategy
keyspace
Create a keyspace with NetworkTopologyStrategy and
strategy option "DC1" with a value of "1" and "DC2" with a
value of "2" like this:

cqlsh> CREATE KEYSPACE <ksname>
WITH REPLICATION = {
'class':'NetworkTopologyStrategy',
'DC1':1,
'DC2':2
};
Exercise: Create Table "users"
● Connect to the "twotter" keyspace.
cqlsh> USE twotter;
● Create new column family (Table) named "users".
cqlsh:twotter> CREATE TABLE users (
id int PRIMARY KEY,
name text,
email text
);
cqlsh:twotter> DESCRIBE TABLES;
cqlsh:twotter> DESCRIBE TABLE users;

*we use int instead of uuid in the exercises for the sake of readability
Exercise: Create Table "messages"
● Create a new Table named "messages" with the
attributes "posted_on", "user_id", "user_name", "body",
and a primary key that consists of "user_id" and
"posted_on".
cqlsh:twotter> CREATE TABLE messages (
posted_on bigint,
user_id int,
user_name text,
body text,
PRIMARY KEY (user_id, posted_on)
);
*we use bigint instead of timeuuid in the exercises for the sake of readability
Exercise: Insert data into Table
"users" of keyspace "twotter"
cqlsh:twotter>
INSERT INTO users(id, name, email)
VALUES (101, 'otto', '[email protected]');

cqlsh:twotter> ... insert more records ...

cqlsh> SOURCE '~/cassandra_training/cql3/03_insert';
Exercise: Insert message records

cqlsh:twotter>
INSERT INTO messages (user_id, posted_on,
user_name, body)
VALUES (101, 1384895178, 'otto', 'Hello!');

cqlsh:twotter> SELECT * FROM messages;

Read data
cqlsh:twotter> SELECT * FROM users;
Delete data
cqlsh:twotter> DELETE FROM users
WHERE id = 106;
Secondary Index
cqlsh:twotter>
CREATE INDEX email_index ON users(email);

cqlsh:twotter> SELECT name, email FROM
users WHERE name = 'otto';

cqlsh:twotter> SELECT name, email FROM
users WHERE email = '[email protected]';

* The first query is rejected because "name" has no index; the second query is served by email_index.
Alter Table Schema
cqlsh:twotter>
ALTER TABLE users ADD password text;

cqlsh:twotter>
ALTER TABLE users
ADD password_reset_token text;

* Because Cassandra's schema is flexible, a CQL ALTER finishes much more quickly than an RDBMS SQL ALTER, which may have to update all existing records.
Alter Table Schema
id | email | name | password | password_reset_token
-----+----------------+---------+----------+----------------------
107 | [email protected] | john | null | null
108 | [email protected] | michael | null | null
104 | [email protected] | linda | null | null
102 | [email protected] | jane | null | null
101 | [email protected] | otto | null | null
103 | null | karl | null | null
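Why adding a column is cheap, and why the new columns show up as null, can be sketched with a sparse-row model in Python. This is an assumption-laden illustration rather than Cassandra's storage format: `schema`, `rows`, `alter_add`, and `select_all` are hypothetical names, and the e-mail address is a placeholder.

```python
# Table metadata: the list of known column names.
schema = ["id", "email", "name"]

# Sparse rows: each row stores only the columns that were actually written.
rows = {
    101: {"id": 101, "email": "otto@example.org", "name": "otto"},  # placeholder address
    103: {"id": 103, "name": "karl"},  # no email was ever written
}


def alter_add(column: str) -> None:
    """Add a column by changing metadata only; no stored row is touched."""
    schema.append(column)


def select_all():
    """Return every row with all schema columns; unwritten columns read as None."""
    return [{column: row.get(column) for column in schema} for row in rows.values()]


alter_add("password")
alter_add("password_reset_token")

for result in select_all():
    print(result)
```

The stored rows are unchanged after both `alter_add` calls; the null values appear only when a row is read against the extended schema.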
Collections - Set
CQL3 introduces collections for storing complex data
structures, namely the following: set, list, and map. This is
the CQL way of modelling many-to-one relationships.
1. Let us add a set of "hobbies" to the Table "users".
cqlsh:twotter> ALTER TABLE users ADD hobbies set<text>;
cqlsh:twotter> UPDATE users
SET hobbies = hobbies + {'badminton','jazz'}
WHERE id = 101;
Collections - List
2. Now create a Table "followers" with a list of followers.
cqlsh:twotter> CREATE TABLE followers (
user_id int PRIMARY KEY,
followers list<text>);
cqlsh:twotter> INSERT INTO followers (
user_id, followers)
VALUES (101, ['willi','heinz']);

cqlsh:twotter> SELECT * FROM followers;

user_id | followers
---------+--------------------
101 | ['willi', 'heinz']
Collections - Map
3. Add a map to the Table "messages".
cqlsh:twotter>
ALTER TABLE messages
ADD comments map<text, text>;

cqlsh:twotter>
UPDATE messages
SET comments = comments + {'otto':'thx!'}
WHERE user_id = 103
AND posted_on = 1384895223;
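The effect of the three collection updates above can be mimicked with Python's built-in types. This is a rough analogy, not a description of how Cassandra stores collections; the helper names `add_to_set`, `append_to_list`, and `put_in_map` are hypothetical, and `None` stands for a collection column that has not been written yet.

```python
def add_to_set(current, new_items):
    """Like `hobbies = hobbies + {...}`: a union, so duplicates collapse."""
    return set(current or ()) | set(new_items)


def append_to_list(current, new_items):
    """Like appending to a list<text>: order is kept and duplicates are allowed."""
    return list(current or []) + list(new_items)


def put_in_map(current, entries):
    """Like `comments = comments + {...}`: new keys are added, existing keys overwritten."""
    merged = dict(current or {})
    merged.update(entries)
    return merged


hobbies = add_to_set(None, {"badminton", "jazz"})
hobbies = add_to_set(hobbies, {"jazz"})  # adding an existing element changes nothing

followers = append_to_list(["willi", "heinz"], ["willi"])

comments = put_in_map(None, {"otto": "thx!"})
comments = put_in_map(comments, {"otto": "thanks!", "linda": "hi"})

print(sorted(hobbies), followers, comments)
```

The set update is idempotent while the list append is not, which is one reason to prefer a set when the same write may be retried.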
Consistency Level
● Set the consistency level for all subsequent requests:
cqlsh:twotter> CONSISTENCY ONE;
cqlsh:twotter> CONSISTENCY QUORUM;
cqlsh:twotter> CONSISTENCY ALL;

● Show the current consistency level:

cqlsh:twotter> CONSISTENCY;
Exercise: Consistency Level
● Set the consistency level to ANY and execute a SELECT
statement.

Bad Request: ANY ConsistencyLevel is only supported for writes
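How many replicas must answer at each of the levels ONE, QUORUM, and ALL follows from the replication factor. The Python sketch below uses the standard quorum formula (half the replication factor, rounded down, plus one); the function names are hypothetical and only a single data center is considered.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def replicas_required(level: str, replication_factor: int) -> int:
    """Replicas that must respond for ONE, QUORUM, or ALL."""
    required = {
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def read_sees_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    """True when read and write replica sets must overlap (R + W > RF)."""
    return replicas_required(read_level, rf) + replicas_required(write_level, rf) > rf


for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    print(read_level, write_level, read_sees_latest_write(read_level, write_level, rf=3))
```

With a replication factor of 3, QUORUM reads combined with QUORUM writes always overlap in at least one replica, whereas ONE combined with ONE does not.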
Exercise: Time-To-Live (TTL)
● Insert a user record with a password reset
token with a 77 second TTL value.
cqlsh:twotter>
INSERT INTO users (id, name,
password_reset_token)
VALUES (109, 'timo', 'abc-xyz-123')
USING TTL 77;
Exercise: Time-To-Live (TTL)
● The previous INSERT statement applies the TTL to every column it
writes, so the entire user record expires after 77 seconds.
● This is what we actually wanted to do:
cqlsh:twotter> INSERT INTO users (id, name)
VALUES(110, 'anna');
cqlsh:twotter> UPDATE users USING TTL 77
SET password_reset_token =
'abc-xyz-123'
WHERE id = 110;
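The difference between the two approaches comes down to which columns carry the expiry. The Python model below is a simplified sketch: the `Row` class is hypothetical, time is passed in explicitly as `now`, and real Cassandra bookkeeping for primary keys and tombstones is ignored.

```python
from typing import Optional


class Row:
    """A row whose columns each carry their own optional expiry time."""

    def __init__(self):
        self.cells = {}  # column name -> (value, expiry time or None)

    def write(self, columns: dict, now: float, ttl: Optional[int] = None) -> None:
        """Write columns; a TTL applies only to the columns in this write."""
        expires_at = None if ttl is None else now + ttl
        for name, value in columns.items():
            self.cells[name] = (value, expires_at)

    def read(self, now: float) -> dict:
        """Return the columns that have not expired at time `now`."""
        return {name: value for name, (value, expires_at) in self.cells.items()
                if expires_at is None or now < expires_at}


# INSERT ... USING TTL 77: every written column expires together.
timo = Row()
timo.write({"id": 109, "name": "timo", "password_reset_token": "abc-xyz-123"}, now=0, ttl=77)

# INSERT without TTL, then UPDATE ... USING TTL 77 on the token only.
anna = Row()
anna.write({"id": 110, "name": "anna"}, now=0)
anna.write({"password_reset_token": "abc-xyz-123"}, now=0, ttl=77)

print(timo.read(now=78))
print(anna.read(now=78))
```

After 78 seconds the first row has no live columns left, while the second row has lost only its reset token.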
Time-To-Live (TTL)
● Check the remaining TTL in seconds.

cqlsh:twotter> SELECT TTL(password_reset_token)
FROM users
WHERE id = 110;
Counter Columns
Create a Counter Column Table that counts
"upvote" and "downvote" events.

cqlsh:twotter> CREATE TABLE votes (
user_id int,
msg_created_on bigint,
upvote counter,
downvote counter,
PRIMARY KEY (user_id, msg_created_on)
);
Counter Columns
cqlsh:twotter>
UPDATE votes SET upvote = upvote + 1
WHERE user_id = 101
AND msg_created_on = 1234;

cqlsh:twotter>
UPDATE votes
SET downvote = downvote + 2
WHERE user_id = 101
AND msg_created_on = 1234;

cqlsh:twotter> SELECT * FROM votes;
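What the final SELECT should show can be traced with a small Python model of the two counter updates. This is only a sketch: the `votes` dictionary and the `increment` helper are hypothetical, and the distributed aspects of counters are not modelled. It relies on the fact that a counter which has never been written is treated as 0 when it is first incremented.

```python
from collections import defaultdict

# (user_id, msg_created_on) -> counter columns, implicitly starting at 0.
votes = defaultdict(lambda: {"upvote": 0, "downvote": 0})


def increment(user_id: int, msg_created_on: int, column: str, delta: int) -> None:
    """Apply a relative update such as `SET upvote = upvote + 1`."""
    votes[(user_id, msg_created_on)][column] += delta


increment(101, 1234, "upvote", 1)
increment(101, 1234, "downvote", 2)

print(votes[(101, 1234)])
```

Counter columns accept only relative updates like these; the row is created by the first increment rather than by an INSERT.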

sstable2json utility tool
$ sstable2json /var/lib/cassandra/data/twotter/users/*.db > *.json

* The .json file is about twice as large as the .db file (X MB .db file, 2X MB .json file).
Exercise: sstable2json
● Insert a few records
● Flush the users column family to disk and create a json
representation of a *.db file.
Solution: sstable2json
$ nodetool flush twotter users
$ sstable2json /var/lib/cassandra/data/twotter/users/twotter-users-jb-1-Data.db > twotter-users.json
$ cat twotter-users.json
[{"key": "00000069","columns": [["","",
1384963716697000], ["email","[email protected]",
1384963716697000], ["name","gerd",
1384963716697000]]},
{"key": "00000068","columns": [["","",
1384963716685000], ...
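The keys and timestamps in this output can be decoded with a few lines of Python. The sketch assumes that the row key is the hex encoding of the big-endian `int` primary key and that the third element of each column entry is the write timestamp in microseconds since the epoch; `decode_key` and `to_datetime` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone


def decode_key(hex_key: str) -> int:
    """Turn a hex row key such as "00000069" back into the int user id."""
    return int(hex_key, 16)


def to_datetime(micros: int) -> datetime:
    """Convert a microsecond write timestamp to a UTC datetime."""
    seconds, remainder = divmod(micros, 1_000_000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(microseconds=remainder)


print(decode_key("00000069"), decode_key("00000068"))
print(to_datetime(1384963716697000).isoformat())
```

Under these assumptions the two keys shown are the users with ids 105 and 104, written on 2013-11-20.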
CQL v3.1.0
(New in Cassandra 2.0)

● IF keyword
● Lightweight transactions (“Compare-And-Set”)
● Triggers (experimental!!)
● CQL paging support
● Drop column support
● SELECT column aliases
● Conditional DDL
● Index enhancements
● cqlsh COPY
IF Keyword

cqlsh> DROP KEYSPACE twotter;

cqlsh> DROP KEYSPACE twotter;
Bad Request: Cannot drop non existing
keyspace 'twotter'.
cqlsh> DROP KEYSPACE IF EXISTS twotter;
Lightweight Transactions
● Compare And Set (CAS)
● Example: without CAS, two users attempting
to create a unique user account in the same
cluster could overwrite each other’s work
with neither user knowing about it.

Source: http://www.datastax.com/documentation/cassandra/2.0/webhelp/cassandra/dml/dml_about_transactions_c.html
Lightweight Transactions
1. Register a new user
cqlsh:twotter>
INSERT INTO users (id, name, email)
VALUES (110, 'franz', '[email protected]')
IF NOT EXISTS;
2. Perform a CAS reset of Franz’s email.
cqlsh:twotter>
UPDATE users
SET email = '[email protected]'
WHERE id = 110
IF email = '[email protected]';
Exercise: Lightweight Transactions

● Perform a failing CAS e-mail reset:

cqlsh:twotter>
UPDATE users
SET email = '[email protected]'
WHERE id = 110
IF email = '[email protected]';

[applied] | email
-----------+-------------
False | [email protected]
Exercise: Lightweight Transactions
● Write a password reset method by using an
expiring password_reset_token column and
a CAS password update query.
Exercise: Lightweight Transactions
cqlsh:twotter>
UPDATE users USING TTL 77
SET password_reset_token = 'abc-xyz-123'
WHERE id = 110;

cqlsh:twotter>
UPDATE users
SET password = 'geheim!'
WHERE id = 110
IF password_reset_token = 'abc-xyz-123';
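The compare-and-set behaviour of `IF NOT EXISTS` and `IF column = value` can be sketched in Python as follows. This is a single-process illustration only: the `Users` class and its methods are hypothetical, and a plain lock stands in for the coordination that Cassandra performs between replicas.

```python
import threading


class Users:
    """In-memory table with conditional writes that report whether they applied."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()  # stand-in for cluster-wide coordination

    def insert_if_not_exists(self, user_id: int, **columns) -> bool:
        """Like INSERT ... IF NOT EXISTS: create the row only if the id is unused."""
        with self._lock:
            if user_id in self._rows:
                return False
            self._rows[user_id] = dict(columns)
            return True

    def update_if(self, user_id: int, expected: dict, changes: dict) -> bool:
        """Like UPDATE ... IF col = value: apply changes only if the condition holds."""
        with self._lock:
            row = self._rows.get(user_id)
            if row is None or any(row.get(k) != v for k, v in expected.items()):
                return False
            row.update(changes)
            return True

    def get(self, user_id: int) -> dict:
        with self._lock:
            return dict(self._rows.get(user_id, {}))


users = Users()
first = users.insert_if_not_exists(110, name="franz", password_reset_token="abc-xyz-123")
second = users.insert_if_not_exists(110, name="someone else")  # id already taken

wrong = users.update_if(110, {"password_reset_token": "guess"}, {"password": "x"})
right = users.update_if(110, {"password_reset_token": "abc-xyz-123"}, {"password": "geheim!"})

print(first, second, wrong, right)
```

Each conditional write returns a flag comparable to the `[applied]` column in cqlsh, so the caller can tell whether its change took effect.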
Create a Trigger
(experimental feature)
● Triggers are written in Java.
● Triggers are currently an experimental feature
in Cassandra 2.0. Use with caution!

cqlsh:twotter>
CREATE TRIGGER myTrigger ON users
USING 'org.apache.cassandra.triggers.InvertedIndex';
