Apache Cassandra: Database


Apache Cassandra

Database
Software Engineering Branch

👩 Database Administration class


BENATHMANE Lalia

Charfaoui Younes & Bourbai Ismail


Hello!
Today we’re going to present the
ins and outs of the Cassandra database.

Our process is easy
● First: Introduction. Basics of NoSQL, Cassandra, and the installation process
● Second: Key Principles. The different aspects of the Cassandra database
● Third: Demo Example. A demo illustrating the basics of the Cassandra Query Language
● Last: Debate. Strengths and weaknesses, and some questions

1

Introduction
Let’s start with some definitions.
“I don’t always use Cassandra,
but when I do, I denormalize.”
- Meme

NoSQL Databases
A NoSQL database (Not Only SQL) is a database that provides
a mechanism to store and retrieve data other than the tabular
relations used in relational databases. These databases are
schema-free, support easy replication, have simple APIs, are
eventually consistent, and can handle huge amounts of data.

NoSQL Databases
In general, they share the following features:

● Schema-free databases
● Easy replication support
● Simple API
● Distributed
● Open source
● BASE (instead of ACID)
● Huge amount of data
● Horizontally scalable

Apache Cassandra
A distributed NoSQL database
system for managing large
amounts of structured data
across many commodity servers,
while providing highly available
service and no single point of
failure.

Characteristics
Cassandra supports most of the general DBMS characteristics:
● Physical independence
● Data security
● Data sharing
● Speed of access
● Verification of integrity
● Manipulability
● Limitation of redundancy

The
Installation
To start using Cassandra, we first need to set up a
workspace for it.

Requirements:
● The latest version of Java 8
● The latest version of Python 2.7 or 3.6
● Download the Software (DataStax Community Edition
for Apache Cassandra™)

Additional Tool:
You can use DataGrip to interact with the database
instead of cqlsh, but it requires a license key.

https://www.jetbrains.com/datagrip/

1 – Choose Cassandra
2 – Connection name
3 – Keyspace name

2

Key Principles
What must be understood about Cassandra

Key Features
● High performance
● CQL query interface
● Distributed & decentralized
● Column oriented
● Elastic scalability
● Tunable consistency
● Fault tolerance

These features make the Cassandra empire!
Distributed & Decentralized
● Distributed: capable of running on multiple machines
● Decentralized: no single point of failure
● No master-slave issues, thanks to the peer-to-peer architecture ("gossip" protocol)
● Read and write requests can be sent to any node

Elastic Scalability
● Cassandra scales horizontally by adding more machines that hold all or some of the data
● Adding nodes increases throughput linearly
● Decreasing and increasing the node count happens seamlessly
● Scales linearly to terabytes and petabytes of data
High Availability & Fault Tolerance
High availability?
● Multiple networked computers operating in a cluster
● Facility for recognizing node failures
● Failover: requests are forwarded to another part of the system
No single point of failure, due to the peer-to-peer architecture

Column oriented
Key-value store
● Data is stored in sparse multidimensional hash tables
● A row can have multiple columns, not necessarily the same number of columns for each row
● Each row has a unique key, which also determines partitioning
No relations!

Figure: row R1 holds three columns (C1, C2, C3) and row R2 holds two columns (C4, C5); each column is a key with its value.
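
To make the "sparse multidimensional hash table" idea concrete, here is a minimal Python sketch. It is only an in-memory illustration, not how Cassandra stores data on disk; the names `rows`, `get_column`, and `column_count` are made up for this example. Each row key maps to its own dictionary of columns, so two rows do not need the same columns.

```python
# Hypothetical in-memory model of a sparse, column-oriented store.
# Each row key maps to its own dict of column name -> value.
rows = {
    "R1": {"C1": "v1", "C2": "v2", "C3": "v3"},
    "R2": {"C4": "v4", "C5": "v5"},
}


def get_column(store, row_key, column):
    """Return the value of a column, or None when the row never stored it."""
    return store.get(row_key, {}).get(column)


def column_count(store, row_key):
    """Number of columns actually stored for a row."""
    return len(store.get(row_key, {}))
```

Looking up `get_column(rows, "R2", "C1")` returns `None` because row R2 never stored that column; nothing is reserved for it.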

Cassandra Query Language

● “CQL 3 is the default and primary interface into the Cassandra DBMS”
● Familiar SQL-like syntax that maps to Cassandra's storage engine and simplifies data modelling

“SQL-like” but NOT relational SQL

Cassandra Query Language
CREATE TABLE songs (
    id uuid PRIMARY KEY,
    title text,
    album text,
    artist text,
    data blob
);

-- uuid values are written without quotes
INSERT INTO songs (id, title, album, artist)
VALUES (a3e64f8f..., 'Hazim ra3d', 'Spacetoon', 'Tarkan');

SELECT * FROM songs WHERE id = a3e64f8f...;

SELECT * FROM songs;

Cassandra Query Language

INSERT INTO songs (id, title)
VALUES (g617Dd23..., 'Al Kanas');

This is possible with Cassandra: columns left out of the INSERT are simply not stored.

😋
Cassandra Query Language
The resulting table in an RDBMS is this:

id          title        artist   album       data
a3e64f8f…   Hazim Ra3d   Tarkan   Spacetoon   null
g617Dd23…   Al Kanas     null     null        null

Cassandra Query Language
The resulting table in Cassandra is this:

id          title        artist   album       data
a3e64f8f…   Hazim Ra3d   Tarkan   Spacetoon
g617Dd23…   Al Kanas

MySQL Comparison:
Statistics based on 50 GB of data

                Cassandra   MySQL
Average write   0.12 ms     ~300 ms
Average read    15 ms       ~350 ms

Stats provided by the authors, using Facebook data.

And Much More…

The Data
Model
How is the database organized?

Data Model
Cluster:
A Cassandra database is distributed over several machines that operate
together. The outermost container is known as the cluster. For failure
handling, data is replicated across nodes, and if a node fails, a replica
takes over. Cassandra arranges the nodes of a cluster in a ring and
assigns data to them.
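
The ring idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cassandra's implementation: the `Ring` class, the node names, and the use of MD5 as the hash are all stand-ins chosen for illustration, and each node owns a single position on the ring. A key is placed on the first node clockwise from its hash, and the following nodes hold the replicas.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring: each node owns one token; a key goes to the first
    node whose token is >= the key's token, wrapping around the ring."""

    def __init__(self, nodes, replication_factor=2):
        self.replication_factor = replication_factor
        # Sort nodes by token so the ring can be walked clockwise.
        self.tokens = sorted((self._token(n), n) for n in nodes)

    @staticmethod
    def _token(value):
        # MD5 is only a stand-in hash for this sketch.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, partition_key):
        """Return the nodes that hold the key: the owner plus the next
        nodes clockwise, up to the replication factor."""
        start = bisect.bisect_left(self.tokens, (self._token(partition_key), ""))
        count = min(self.replication_factor, len(self.tokens))
        return [self.tokens[(start + i) % len(self.tokens)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=2)
owners = ring.replicas("a3e64f8f")
```

If the first node in `owners` fails, the second one still holds a copy of the key, which is the failure handling described above.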

Data Model
● Keyspace: outermost container for data (one or more column families), like a database in an RDBMS.
● Column family: contains super columns or columns (but not both).
● Column: basic data structure with key, value, and timestamp.

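Data Model
Figure: a keyspace holds its settings and one or more column families; a column family holds its settings and its columns; each column has a key, a value, and a timestamp.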
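
The nesting of keyspace, column family, and column can be sketched with Python dataclasses. This is a simplified illustration, not Cassandra's storage code: the class names mirror the slide, while the `write` method and the sample values are assumptions made for the example. It also shows one use of the timestamp: when two writes target the same column, the one with the newer timestamp is kept.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    key: str
    value: object
    timestamp: int  # write time; the newest write wins


@dataclass
class ColumnFamily:
    name: str
    rows: dict = field(default_factory=dict)  # row key -> {column key -> Column}

    def write(self, row_key, column):
        row = self.rows.setdefault(row_key, {})
        current = row.get(column.key)
        # Keep the column with the most recent timestamp (simplified rule).
        if current is None or column.timestamp >= current.timestamp:
            row[column.key] = column


@dataclass
class Keyspace:
    name: str
    column_families: dict = field(default_factory=dict)


keyspace = Keyspace("test")
users = keyspace.column_families.setdefault("users", ColumnFamily("users"))
users.write("u1", Column("name", "Alice", timestamp=2))
users.write("u1", Column("name", "stale", timestamp=1))  # older write is ignored
```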

3

Demo
Examples illustrating different parts of CQL

Examples Using CQL
The following slides demonstrate different cases with the different CQL interfaces, such as DDL, DML, etc.

Tables used:
● User: Id, Name, Phone, Age
● Emails: Id, email

Interface DDL
Same as SQL, but with keyspace and type options added.
● CREATE: Type, Keyspace, Table, Index, Trigger
● ALTER: Type, Keyspace, Table, Index, Trigger
● DROP: Type, Keyspace, Table, Index, Trigger

Interface DML
The DML interface is the same as normal SQL DML:
● SELECT
● INSERT
● UPDATE
● DELETE

Interface DCL
Create users (roles), give them permissions, and start using them.
● USER: CREATE, DROP, ALTER
● PERMISSION: GRANT, REVOKE
● VIEW: LIST USERS, LIST PERMISSIONS


Interface TCL
For multiple operations, use the BATCH command:
● START: BEGIN BATCH
● OPERATIONS: DML statements
● END: APPLY BATCH
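
As a rough sketch of the BEGIN BATCH / APPLY BATCH shape, the Python helper below assembles a batch as plain text. The function `build_batch` is hypothetical and only builds a string; it does not talk to a database, and the `users` table with an integer `id` is an assumption for the example.

```python
def build_batch(statements):
    """Wrap DML statements in BEGIN BATCH ... APPLY BATCH."""
    if not statements:
        raise ValueError("a batch needs at least one statement")
    # Normalize each statement so it ends with exactly one semicolon.
    body = "\n".join("  " + s.strip().rstrip(";") + ";" for s in statements)
    return "BEGIN BATCH\n" + body + "\nAPPLY BATCH;"


batch = build_batch([
    "INSERT INTO users (id, name) VALUES (1, 'Alice')",
    "UPDATE users SET age = 30 WHERE id = 1;",
])
```

The resulting text can be pasted into cqlsh as a single statement.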

Metadata
& Logging
How to view metadata and set up logging in a
Cassandra database?

Metadata Using DESCRIBE
Keyspace   DESCRIBE KEYSPACE keyspace_name
Table      DESCRIBE TABLE keyspace_name.table_name
Others     DESCRIBE KEYSPACES, DESCRIBE TABLES, DESCRIBE SCHEMA

Metadata Keyspace
Query the defined keyspaces using the SELECT statement.

SELECT * FROM system_schema.keyspaces;

keyspace_name   durable_writes   replication
test            True             {'class': 'org.apache'}….
……              ……               ……

Metadata Tables
Getting information about the tables in the test keyspace.

SELECT * FROM system_schema.tables
WHERE keyspace_name = 'test';

keyspace_name   table_name   …….
test            users        ……..
……              ……           ……

Metadata Columns
Getting information about the columns in the users table.

SELECT * FROM system_schema.columns
WHERE keyspace_name = 'test' AND table_name = 'users';

table_name   column_name   kind      type   …….
users        age           regular   int    ……..
……           ……            ……        ……     ……

Logging with System.log
To see what is happening in the database, you can use the
system.log file in the Cassandra home directory to track
creation queries.

{CASSANDRA HOME}/utils/cassandra.logdir_IS_UNDEFINED/

Logging with System.log
Here is an Example

INFO [main] 2018-11-08 23:48:36,960 MigrationManager.java:302 - Create new Keyspace:
KeyspaceMetadata {name=system_traces,
params=KeyspaceParams {durable_writes=true,
replication=ReplicationParams
{class=org.apache.cassandra.locator.SimpleStrategy,
replication_factor=2 }

Logging with Tracing
Tracing is an option that can be activated in cqlsh:

TRACING [ON | OFF]

The results are stored in a separate keyspace called system_traces, in a table called events:

USE system_traces;
SELECT * FROM events;

Logging with Tracing
Example:

INSERT INTO product(id , name) VALUES (UUID(), 'Hello');

Result:
Execute CQL3 query
Parsing insert into product(id , name) values(UUID(), 'Hello');
Preparing statement
……

4

Debate
Strengths and weaknesses of Cassandra.
Strengths (1)
● Linear scale performance
The ability to add nodes without failures leads to predictable
increases in performance
● Supports multiple languages
Python, C#/.NET, C++, Ruby, Java, Go, and many more…
● Operational and developmental simplicity
There are no complex software tiers to be managed, so
administration duties are greatly simplified.

Strengths (2)
● Ability to deploy across data centers
Cassandra can be deployed across multiple, geographically
dispersed data centers
● Cloud availability
Installations in cloud environments
● Peer to peer architecture
Cassandra follows a peer-to-peer architecture, instead of
master-slave architecture

Strengths (3)
● Flexible data model
Supports modern data types with fast writes and reads
● Fault tolerance
Nodes that fail can easily be restored or replaced
● High Performance
Cassandra has demonstrated brilliant performance under
large sets of data

Strengths (4)
● Schema-free/Schema-less
In Cassandra, columns can be created at will within the
rows. The Cassandra data model is also known as a
schema-optional data model
● AP-CAP
Cassandra is typically classified as an AP system, meaning
that availability and partition tolerance are generally
considered to be more important than consistency in
Cassandra
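
The AP classification connects to the "tunable consistency" feature listed earlier: each request chooses how many replicas must acknowledge it (Cassandra names such levels ONE, QUORUM, and ALL). The sketch below is a small arithmetic illustration, with hypothetical function names, of the usual rule of thumb: a read is guaranteed to overlap the latest write when the number of write acknowledgements plus read acknowledgements exceeds the replication factor.

```python
def quorum(replication_factor):
    """Smallest majority of replicas."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_acks, read_acks):
    """True when every read set must overlap every write set."""
    return write_acks + read_acks > replication_factor


rf = 3
strong = reads_see_latest_write(rf, quorum(rf), quorum(rf))  # QUORUM + QUORUM
weak = reads_see_latest_write(rf, 1, 1)                      # ONE + ONE
```

With a replication factor of 3, quorum writes and quorum reads overlap on at least one replica, while writing and reading at ONE trades that guarantee for lower latency and higher availability.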

Weaknesses (1)
Use cases where it is better to avoid using Cassandra
● If there are too many joins required to retrieve the data
● To store configuration data
● During compaction, things slow down and throughput
degrades
● Basic things like aggregation operators are not supported
● Range queries on partition key are not supported

Weaknesses (2)
Use cases where it is better to avoid using Cassandra
● If there is transactional data that requires 100%
consistency
● Cassandra can update and delete data, but it is not
designed for workloads that do so heavily

Thanks!
Any questions?
