Axway API Management 7.5.x: Cassandra Best Practices
© 2016 Axway 1
Apache Cassandra
Overview
© 2016 Axway 2
Cassandra overview
What is Cassandra?
© 2016 Axway 3
Cassandra overview
Architecture: a cluster with 4 Cassandra nodes
© 2016 Axway 4
Cassandra overview
Write process
© 2016 Axway 6
Cassandra overview
Replication factor - Definition
• Data is replicated across a cluster. A user requests data from any node (which becomes that user's coordinator node), with the query being assembled from one or more nodes holding the necessary data.
• Reading data is performed in parallel.
• The client is aware of every single node and can ask any of them (every node can receive a read request).
• If we go back to our previous example: node 4 does not have the right data, but it knows all the nodes of the cluster and will play the role of coordinator.
[Diagram: node 1 holds the primary replica and returns a 5µs ack.]
© 2016 Axway 8
Cassandra overview
Read process
[Diagram: the client sends a read to node 4, which acts as coordinator; node 1 holds the primary replica, and nodes 2 and 3 each hold a copy of it.]
© 2016 Axway 9
Apache Cassandra
Focus on
consistency level
© 2016 Axway 10
Consistency
Definition
• Consistency refers to how up-to-date and synchronized a row of Cassandra data is on all of its
replicas. Cassandra extends the concept of eventual consistency by offering tunable
consistency. For any given read or write operation, the client application decides how consistent
the requested data must be.
• Even at low consistency levels, Cassandra writes to all replicas of the partition key, even replicas
in other data centers. The consistency level determines only the number of replicas that need to
acknowledge the write success to the client application. Typically, a client specifies a
consistency level that is less than the replication factor specified by the keyspace. This practice
ensures that the coordinating server node reports the write successful even if some replicas are
down or otherwise not responsive to the write.
Resource: https://docs.datastax.com/en/cassandra/2.0/cassandra/dml/dmlAboutDataConsistency.html?hl=consistency
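For any given operation the client chooses the level; for example, in the cqlsh client the consistency level can be set for the current session (a minimal sketch using standard cqlsh commands):
  # Show the current consistency level (the default is ONE)
  > CONSISTENCY;
  # Require a quorum of replicas for subsequent reads and writes in this session
  > CONSISTENCY QUORUM;
In API Management, the equivalent read/write consistency level for the KPS collections is configured via Policy Studio (see the deployment sections later in this deck).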
© 2016 Axway 11
Consistency (Write)
Definition
• The coordinator sends a write request to all replicas that own the row being written. As long as all replica nodes are up and available, they will get the write regardless of the consistency level specified by the client. The write consistency level determines how many replica nodes must respond with a success acknowledgment in order for the write to be considered successful. Success means that the data was written to the commit log and the memtable, as described in About writes.
Level definition
• One : A write must be written to the commit log and memtable of at least one replica node.
• Two : A write must be written to the commit log and memtable of at least two replica nodes.
• Quorum* : A write must be written to the commit log and memtable on a quorum of replica nodes across all data centers.
• Local_Quorum* : Strong consistency. A write must be written to the commit log and memtable on a quorum of replica nodes in the same data center as the coordinator node. Avoids latency of inter-data center communication.
• All : A write must be written to the commit log and memtable on all replica nodes in the cluster for that partition.
Note* : Q = QUORUM, where Q = N / 2 + 1 and N = replication factor (see the worked example below).
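As a worked example of the note above: with a replication factor of N = 3, Q = 3 / 2 + 1 = 2 (integer division), so a QUORUM write succeeds once 2 of the 3 replicas have acknowledged it. A minimal cqlsh sketch (the users table is only illustrative, borrowed from the write-process slides later in this deck):
  # Every replica must acknowledge the write (the write fails if any replica is down)
  > CONSISTENCY ALL;
  > UPDATE users SET firstname = 'Patrick' WHERE id = 'pmcfadin';
  # Only a quorum must acknowledge (2 of 3 replicas when replication_factor = 3)
  > CONSISTENCY QUORUM;
  > UPDATE users SET firstname = 'Patrick' WHERE id = 'pmcfadin';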
© 2016 Axway 12
Consistency (Write)
Example (1/3)
• If we go back to our previous example (4 nodes, replication factor = 3), the incoming write (1) will go to all 3 nodes (2) that own the requested row.
Example (2/3)
• If the write consistency level specified by the client is ONE (1), the first node to complete the write responds back to the coordinator (3), which then proxies the success message back to the client (4). A consistency level of ONE means that it is possible that 2 of the 3 replicas could miss the write if they happened to be down at the time the request was made. If a replica misses a write, Cassandra will make the row consistent later using one of its built-in repair mechanisms: hinted handoff, read repair, or anti-entropy node repair.
[Diagram: the client sends "Write Johnny" (1) to node 4, the coordinator, with write consistency level = ONE; the write is forwarded (2) to nodes 1, 2 and 3 (+Johnny); the fastest acknowledgment (5µs, from node 1) is returned to the coordinator (3) and proxied back to the client (4), while the slower 12µs acks arrive afterwards.]
© 2016 Axway 13
Consistency (Write)
Example (3/3)
[Diagram: the same write with write consistency level = ALL. The client sends "Write Johnny" (1) to node 4, the coordinator, which forwards the write (2) to nodes 1, 2 and 3 (+Johnny). The coordinator must collect acknowledgments from all replicas (3) before reporting success to the client (4), so the slowest replica (500µs ack from node 3, versus 5µs and 12µs from the others) determines the overall response time.]
© 2016 Axway 14
Consistency (Read)
Definition
• There are three types of read requests that a coordinator node can send to a replica:
  • A direct read request
  • A digest request
  • A background read repair request
• The coordinator node contacts one replica node with a direct read request. Then the coordinator sends a digest request to a number of replicas determined by the consistency level specified by the client. The digest request checks the data in the replica node to make sure it is up to date. Then the coordinator sends a digest request to all remaining replicas. If any replica nodes have out-of-date data, a background read repair request is sent. Read repair requests ensure that the requested row is made consistent on all replicas.
• For a digest request, the coordinator first contacts the replicas specified by the consistency level. The coordinator sends these requests to the replicas that are currently responding the fastest. The nodes contacted respond with a digest of the requested data; if multiple nodes are contacted, the rows from each replica are compared in memory to see if they are consistent. If they are not, the replica that has the most recent data (based on the timestamp) is used by the coordinator to forward the result back to the client.
• To ensure that all replicas have the most recent version of frequently-read data, the coordinator also contacts and compares the data from all the remaining replicas that own the row in the background. If the replicas are inconsistent, the coordinator issues writes to the out-of-date replicas to update the row to the most recent values. This process is known as read repair. Read repair can be configured per table for non-QUORUM consistency levels (using read_repair_chance), and is enabled by default.
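The read_repair_chance mentioned above is a per-table option in this generation of Cassandra; a minimal cqlsh sketch (the table name is only illustrative):
  # Trigger a background read repair on roughly 10% of reads at non-QUORUM consistency levels
  > ALTER TABLE users WITH read_repair_chance = 0.1;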
© 2016 Axway 15
Consistency (Read)
Level definition
• One : Returns a response from the closest replica, as determined by the snitch. By default, a read repair runs in the background to
make the other replicas consistent.
• Two : Returns the most recent data from two of the closest replicas.
• Quorum* : Returns the record after a quorum of replicas from all data centers has responded.
• Local_Quorum* : Returns the record after a quorum of replicas in the same data center as the coordinator node has responded. Avoids latency of inter-data center communication.
• All : Returns the record after all replicas have responded. The read operation will fail if a replica does not respond.
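For example, reads can be issued at different levels from cqlsh (a minimal sketch reusing the illustrative users table):
  # Fastest: answer from the closest replica, with a possible background read repair
  > CONSISTENCY ONE;
  > SELECT firstname FROM users WHERE id = 'pmcfadin';
  # Stronger: wait for a quorum of replicas in the local data center
  > CONSISTENCY LOCAL_QUORUM;
  > SELECT firstname FROM users WHERE id = 'pmcfadin';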
© 2016 Axway 16
Consistency (Read)
Example (1/2)
• In a single data center cluster with a replication factor of 3 and a read consistency level of ONE (1), the closest replica for the given row is contacted to fulfill the read request (2). In the background, a read repair is potentially initiated.
Example (2/2)
• In a single data center cluster with a replication factor of 3 and a read consistency level of QUORUM (1), 2 of the 3 replicas for the given row must respond to fulfill the read request (2 & 3). If the contacted replicas have different versions of the row, the replica with the most recent version will return the requested data. In the background, the third replica is checked for consistency with the first two and, if needed, a read repair is initiated for the out-of-date replicas.
[Diagrams: the client sends "Read Johnny" (1) to node 4, the coordinator. With read consistency level = ONE, the closest replica (node 1, 5µs ack) returns the data (2). With read consistency level = QUORUM, two replicas must answer (nodes 1 and 2, 5µs and 12µs acks) before the result is returned, and the third replica (node 3, 500µs ack) is only checked in the background for read repair.]
© 2016 Axway 17
Consistency
Summary
[Diagram: two clusters of nodes (Node 1 ... Node N), each replica holding "Johnny". Left: 2 nodes can still provide the data, the quorum can be achieved; 4 nodes can still provide the data, the quorum can be achieved. Right: only one node can still provide the data, the quorum can never be achieved; only 3 nodes can still provide the data, the quorum can never be achieved.]
© 2016 Axway 18
Apache Cassandra
Additional definitions
© 2016 Axway 19
Seed Nodes
Definition
• The seed node designation has no purpose other than bootstrapping the
gossip process for new nodes joining the cluster.
• Cassandra nodes use this list of hosts to find each other and learn the
topology of the ring.
• To prevent problems in gossip communications, use the same list of seed
nodes for all nodes in a cluster.
• More than a single seed node per data center is recommended for fault
tolerance
• Example: cassandra.yaml (see the seed_provider settings in the deployment sections later in this deck)
© 2016 Axway 20
Keyspace
Definition
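A keyspace is the top-level container that groups tables and defines their replication settings. A minimal cqlsh sketch (the keyspace name and settings are only illustrative; in API Management the keyspace is created automatically when the configuration is deployed, as described later):
  > CREATE KEYSPACE demo_keyspace
      WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 3};
  # Show the keyspace definition, including its replication settings
  > DESCRIBE KEYSPACE demo_keyspace;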
© 2016 Axway 21
Cassandra
configuration for
API Mgt
Overview
© 2016 Axway 22
Cassandra usage for API Management
(*) Cassandra is optional for these; other data store options are available
© 2016 Axway 23
Cassandra configuration for API Management - overview
Based on the previous chapters, the elements that must be configured are:
© 2016 Axway 24
Best Practices (1/2)
© 2016 Axway 25
Best Practices (2/2)
© 2016 Axway 26
Cassandra configuration
for API Mgt
Single node deployment
© 2016 Axway 27
Single Node
© 2016 Axway 28
Single node configuration
cassandra.yaml
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring. You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "127.0.0.1"
Keyspace and client configuration
• Register Host
• Designate Admin Node Manager
• Configure API Gateway Instance
• Configure Hector Client via Policy Studio
© 2016 Axway 29
Cassandra configuration for
API Mgt
3 Node Cluster Deployment
(Single Datacenter)
© 2016 Axway 30
3 Node Cluster
• DC 1:
  • Cassandra DB Node 1: 192.168.147.127
  • Cassandra DB Node 2: 192.168.147.128
  • Cassandra DB Node 3: 192.168.147.129
[Diagram: two API Gateways (with or without API Manager) connected to the 3-node Cassandra cluster.]
© 2016 Axway 31
Each node configuration
cassandra.yaml
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring. You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "192.168.147.127"
• All Cassandra instances should reference the same seed.
© 2016 Axway 32
Keyspace and client configuration
For the first API Gateway/Manager
• Register Host
• Configure 1 API Gateway instance
• Configure Hector Client via Policy Studio
• Once the configuration is deployed, the Cassandra keyspace will be created
• Install API Manager on the first gateway
• Update the Read/Write Consistency Level to QUORUM for KPS collections via Policy Studio
• Register the remaining hosts and configure the API Gateway instances only AFTER the replication factor is updated
Update replication factor
• Log in to Cassandra DB Node 1:
  # cd ../cassandra/bin
  # ./cqlsh <IP Address>
  ./cqlsh 192.168.147.127
  # Find the keyspace
  > DESCRIBE KEYSPACES;
  Example: x8746e4a4_e423_40ac_95a7_4934215e4e5d_group_2
  # Execute the following command to alter the keyspace
  > ALTER KEYSPACE x8746e4a4_e423_40ac_95a7_4934215e4e5d_group_2 WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 3};
• Exit the cqlsh utility and run nodetool repair x8746e4a4_e423_40ac_95a7_4934215e4e5d_group_2 on all Cassandra instances.
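To verify that the change was applied, the keyspace definition can be checked from any node (a sketch using the example keyspace name above):
  > DESCRIBE KEYSPACE x8746e4a4_e423_40ac_95a7_4934215e4e5d_group_2;
  # The output should now show 'SimpleStrategy' with 'replication_factor': '3'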
© 2016 Axway 33
Keyspace and client configuration
For the other API Gateway/Manager
© 2016 Axway 34
Apache Cassandra
Reference
© 2016 Axway 35
Reference
Reference: /tmp mounted with --noexec
Reference: JAVA_HOME
© 2016 Axway 36
Apache Cassandra
Tools
© 2016 Axway 37
Tools
http://www.ecyrd.com/cassandracalculator/
© 2016 Axway 38
Tools
DBeaver - Linux
© 2016 Axway 39
Apache Cassandra
To go further in
Cassandra understanding
© 2016 Axway 40
Cassandra: Components
Write process - Additional definitions
• Commit log: The commit log is a crash-recovery mechanism in Cassandra. Every write
operation is written to the commit log.
• Mem-table: A mem-table is a memory-resident data structure. After the commit log, the data is written to the mem-table. Sometimes, for a single column family, there will be multiple mem-tables.
• SSTable: It is a disk file to which the data is flushed from the mem-table when its contents
reach a threshold value.
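A mem-table is normally flushed to an SSTable automatically once the threshold is reached; for testing, a flush can also be forced with the standard nodetool flush command.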
© 2016 Axway 41
Cassandra: Components
Write process (1/3)
• (1) The client sends the request to a node:
  UPDATE users SET firstname = 'Patrick' WHERE id = 'pmcfadin';
• (2) The write is appended to the commit log (written on the server's disk). This is very fast.
[Diagram: the entry id = 'pmcfadin', firstname = 'Patrick' appended to the commit log.]
© 2016 Axway 42
Resource: https://www.youtube.com/watch?v=B_HTdrTgGNs
Cassandra: Components
Write process (2/3)
• (3) The data is then put into a memtable, stored in memory (id = 'pmcfadin', firstname = 'Patrick').
• (4) An acknowledgement is returned to the client.
© 2016 Axway 43
Resource: https://www.youtube.com/watch?v=B_HTdrTgGNs
Cassandra: Components
Write process (3/3)
• (5) The flush process writes the data out to disk into a file called an SSTable. This is sequential IO (sequential writes) rather than random IO, and the data is ordered by time.
© 2016 Axway 44
Resource: https://www.youtube.com/watch?v=B_HTdrTgGNs
Replication Strategies
Definition
• A replication strategy determines the nodes where replicas are placed. Two replication
strategies are available:
• SimpleStrategy :
• Use only for a single data center. SimpleStrategy places the first replica on a node determined by the
partitioner. Additional replicas are placed on the next nodes clockwise in the ring without considering
topology (rack or data center location).
• NetworkTopologyStrategy :
• Use when you have (or plan to have) your cluster deployed across multiple data centers. This strategy specifies how many replicas you want in each data center (see the sketch below).
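For example (a sketch only; the keyspace and data center names are illustrative), the strategy and the replica counts are declared on the keyspace:
  # Single data center: 3 replicas placed clockwise around the ring
  > CREATE KEYSPACE demo_simple
      WITH REPLICATION = {'class' : 'SimpleStrategy', 'replication_factor' : 3};
  # Multiple data centers: 3 replicas in DC1 and 2 replicas in DC2
  > CREATE KEYSPACE demo_multi_dc
      WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'DC1' : 3, 'DC2' : 2};
With NetworkTopologyStrategy, the data center names must match the names reported by the snitch (next slide).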
© 2016 Axway 45
Snitch
Definition
• A snitch determines which data centers and racks nodes belong to.
• Snitches inform Cassandra about the network topology so that requests are routed efficiently, and they allow Cassandra to distribute replicas by grouping machines into data centers and racks.
• Specifically, the replication strategy places the replicas based on the information provided by the new snitch.
• All nodes must return to the same rack and data center. Cassandra does its best not to have more than one replica on the same rack (which is not necessarily a physical location).
Example
• SimpleSnitch
  • The SimpleSnitch (default) is used only for single-data center deployments (or a single zone in public clouds). It does not recognize data center or rack information. It treats strategy order as proximity, which can improve cache locality when disabling read repair.
  • Using a SimpleSnitch, you define the keyspace to use SimpleStrategy and specify a replication factor.
  • Referenced in cassandra.yaml: endpoint_snitch: SimpleSnitch
• GossipingPropertyFileSnitch
  • Automatically updates all nodes using gossip when adding new nodes; this snitch is recommended for production.
  • It uses the rack and data center information for the local node defined in the cassandra-rackdc.properties file and propagates this information to other nodes via gossip.
© 2016 Axway 46
Internode communications (gossip)
Definition
• Cassandra uses a protocol called gossip to discover location and state information about the
other nodes participating in a Cassandra cluster.
• Gossip is a peer-to-peer communication protocol in which nodes periodically exchange state
information about themselves and about other nodes they know about.
• The gossip process runs every second and exchanges state messages with up to three other
nodes in the cluster.
• The nodes exchange information about themselves and about the other nodes that they have
gossiped about, so all nodes quickly learn about all other nodes in the cluster.
• A gossip message has a version associated with it, so that during a gossip exchange, older
information is overwritten with the most current state for a particular node.
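To inspect the state information a node has learned through gossip, the standard nodetool gossipinfo and nodetool status commands can be used.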
© 2016 Axway 47
Thank you!
© 2016 Axway 48