
Apache Cassandra:

1) Apache Cassandra is a free and open-source distributed NoSQL database management system. It distributes and manages data loads across multiple nodes in a cluster of commodity hardware, and is designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.

2) It is a column-oriented database.


3) Cassandra is the right choice for managing
large amounts of structured, semi-structured, and
unstructured data across multiple data centers
and the cloud, when you need scalability and
high availability without compromising
performance.
Cassandra provides automatic data distribution across all nodes that participate in a “ring”
or database cluster.
There is nothing programmatic that a developer or administrator needs to do or code to
distribute data across a cluster because data is transparently partitioned across all nodes
in a cluster.
Avinash Lakshman (one of the authors of
Amazon's Dynamo) and Prashant Malik initially
developed Cassandra at Facebook to power the
Facebook inbox search feature.

Facebook released Cassandra as an open-source
project on Google Code in July 2008. In March
2009 it became an Apache Incubator project.

Facebook developers named their database after
the Trojan mythological prophet Cassandra, with
classical allusions to a curse on an oracle.
When should Cassandra be used?
Here are some use cases where it would be the best choice over other
NoSQL databases.

In activity-tracking and monitoring applications: Numerous
entertainment and media organisations use Cassandra to monitor user
activity based on movies, music, albums, artists or other parameters.

In heavy write systems or in time-series-based applications:
Cassandra is perfect for very heavy write systems — for example, in
Web analytics where the data is logged for each request based on
hits, by type of browser, traffic sources, location, behaviour,
technology, devices, etc.

In social media analytics: Cassandra is used by many social media
providers to analyse the data and provide suggestions to their
customers.

In product catalogues and retail applications: A very popular use case
of Cassandra is to display fast product catalogue inputs and lookups,
and in retail applications.

Messaging: Cassandra serves as the database backbone for
numerous mobile phone and messaging providers' applications.
It does not compromise on availability.

It is not based on a master-slave architecture but
on a peer-to-peer network. As there is no master,
there is no single point of failure.

It is a massively scalable distributed database.

It adheres to the availability and partition
tolerance properties of the CAP theorem. It takes care
of consistency through BASE (Basically Available,
Soft state, Eventual consistency).
Features of Cassandra:
1) Peer-to-Peer network
2) Gossip and Failure Detection
3) Partitioner
4) Replication Factor
5) Anti-Entropy and Read Repair
6) Writes in Cassandra
7) Hinted Handoffs
8) Tunable Consistency
Peer-to-Peer network:
It ensures that data is distributed across all
nodes in the cluster. Each node exchanges
information across the cluster every second.
If a node fails or is taken offline, throughput is
affected, but it is a case of graceful degradation:
everything does not come crashing down at any given
instant owing to a single node failure.
Gossip and failure detection:
It is a peer-to-peer communication protocol. It eases the discovery and sharing of
location and state information with other nodes in the cluster. A primary use of gossip is
for information diffusion (spread): some event occurs, and our goal is to spread the word.

The operation of a gossip protocol is as follows.
Each node has some data associated with it and periodically gossips that data with
another node:

Node A randomly selects a node B from a list of nodes known to it

A sends a message to B containing the data from A

B sends back a response containing its data

A and B update their data set by merging it with the received data
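As an illustrative sketch (not Cassandra's actual implementation), the push-pull exchange above can be simulated with versioned key-value maps; the node names, the version counters, and the round count are invented for the example:

```python
import random

def merge(dst, src):
    """Merge gossip state: for each key, the entry with the higher version wins."""
    for key, (value, version) in src.items():
        if key not in dst or dst[key][1] < version:
            dst[key] = (value, version)

def gossip_round(nodes):
    """One round: every node A picks a random peer B, sends its data,
    and receives B's data in the response; both sides merge what they got."""
    for a in list(nodes):
        b = random.choice([n for n in nodes if n != a])
        merge(nodes[b], dict(nodes[a]))   # A sends its data to B
        merge(nodes[a], dict(nodes[b]))   # B's response back to A

random.seed(1)
nodes = {name: {} for name in ["n1", "n2", "n3", "n4"]}
nodes["n1"]["event"] = ("node n3 joined", 1)   # some event occurs on n1
for _ in range(50):                            # the word spreads
    gossip_round(nodes)
```

After a few rounds every node holds the same state: information diffuses even though each node only ever talks to one random peer at a time.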

Partitioner:
It decides how to distribute data across the various nodes in a cluster.
It determines the node on which to place the very first copy of the data. It is a hash
function that computes a token from the partition/primary key.
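A hedged sketch of the idea: MD5 stands in for the hash (as in Cassandra's older RandomPartitioner; Murmur3 is the modern default), and the modulo step is a simplification of Cassandra's token-range assignment:

```python
import hashlib

def token(partition_key: str) -> int:
    """Compute a token by hashing the partition key
    (MD5 here, as in the old RandomPartitioner)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def node_for_key(partition_key: str, num_nodes: int) -> int:
    """Pick the node that owns the first copy of the data (simplified:
    real Cassandra assigns token ranges to nodes rather than using modulo)."""
    return token(partition_key) % num_nodes
```

Because the token is a pure function of the key, any node can compute where a given key lives without consulting a central directory.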
Replication Factor:
It determines the number of copies of data that
will be stored across nodes in a cluster.

Data replication strategies:


Two replication strategies:
1) SimpleStrategy (places replicas on
subsequent nodes in clockwise order; this is
both rack unaware and data center unaware)
2) NetworkTopologyStrategy (this
method is both rack and data center aware)
The data partitioner determines the coordinator node for each key. The
coordinator node holds the first replica for a key, which is also called the primary
replica. If the replication factor is N, a key's coordinator node replicates it to
N-1 other replicas.
In SimpleStrategy, the successor nodes, i.e. the nodes on the ring immediately
following the coordinator node in the clockwise direction, are selected as
replicas.
In NetworkTopologyStrategy, nodes from distinct available racks in each
data center are chosen as replicas.
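The successor-selection rule of SimpleStrategy can be sketched in a few lines (node names and ring order are invented for illustration):

```python
def simple_strategy_replicas(ring, coordinator, replication_factor):
    """SimpleStrategy sketch: the coordinator holds the primary replica and
    the next replication_factor - 1 nodes clockwise on the ring hold the rest."""
    start = ring.index(coordinator)
    return [ring[(start + i) % len(ring)] for i in range(replication_factor)]

# A four-node ring in clockwise order:
ring = ["node1", "node2", "node3", "node4"]
replicas = simple_strategy_replicas(ring, "node3", 3)
```

With the coordinator at node3 and a replication factor of 3, the replicas wrap around the ring: node3, node4, node1.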
Anti-Entropy and read repair:


In order to achieve fault tolerance, a given piece of
data is replicated on one or more nodes.


A client can connect to any node in the cluster to
read data.


How many nodes will be read before responding to
the client is based on the consistency level
specified by the client. If the client-specified
consistency is not met, the read operation blocks.

There is a possibility that a few of the nodes may
respond with an out-of-date value. In such a case,
Cassandra will initiate a read repair operation to
bring the replicas with stale values up to date.


For repairing, Cassandra uses an anti-entropy mechanism: it
compares all the replicas of each piece of
data that exist and updates each replica to the
newest version.
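A minimal sketch of the read-repair idea, assuming each replica stores a (timestamp, value) pair per key (the city values are invented for the example):

```python
def read_with_repair(replicas, key):
    """Read the key from all replicas, answer with the newest value (by
    timestamp), and repair any replica holding a stale or missing value."""
    newest_ts, newest_value = max(
        (replica[key] for replica in replicas if key in replica),
        key=lambda entry: entry[0],
    )
    for replica in replicas:
        if key not in replica or replica[key][0] < newest_ts:
            replica[key] = (newest_ts, newest_value)   # bring the replica up to date
    return newest_value

# Three replicas of the same row; the second holds an out-of-date value.
replicas = [
    {"city": (2, "Pune")},
    {"city": (1, "Poona")},   # stale
    {"city": (2, "Pune")},
]
```

Calling `read_with_repair(replicas, "city")` returns the newest value and, as a side effect, overwrites the stale entry on the second replica.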
Writes in Cassandra
Sequence of actions performed when a client initiates a write
request:
1) Data is first written to the commit log. (Data in the commit log is
purged after its corresponding data in the memtable is flushed to an
SSTable. The commit log is for recovering the data in the memtable in the
event of a hardware failure.)
2) The write is pushed to a memory-resident data structure called the
memtable. A threshold value is defined for the memtable. When the
number of objects stored in the memtable reaches the threshold, the
contents of the memtable are flushed to disk in a file called an
SSTable (Sorted String Table).
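The two-step write sequence can be sketched as follows (the threshold of three objects is an invented illustration; real memtable flushes are triggered by size, and SSTables here are just sorted in-memory lists):

```python
class WritePath:
    """Sketch of the write path: commit log first, then memtable;
    a full memtable is flushed to an immutable, sorted SSTable."""

    def __init__(self, threshold=3):
        self.commit_log = []   # append-only log for crash recovery
        self.memtable = {}     # memory-resident structure
        self.sstables = []     # flushed Sorted String Tables on "disk"
        self.threshold = threshold

    def write(self, row_key, value):
        self.commit_log.append((row_key, value))   # step 1: durability
        self.memtable[row_key] = value             # step 2: memtable
        if len(self.memtable) >= self.threshold:
            self._flush()

    def _flush(self):
        self.sstables.append(sorted(self.memtable.items()))  # sorted by key
        self.memtable.clear()
        self.commit_log.clear()   # flushed data no longer needs the log

wp = WritePath()
for key in ["c", "a", "b"]:
    wp.write(key, key.upper())
```

After the third write the memtable hits its threshold and is flushed: the SSTable comes out sorted by row key, and both the memtable and the commit log are emptied.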
Hinted handoffs:
If a replica node is down when a write arrives, the coordinator
stores a hint locally and replays the write when the node comes
back online. A hint will have the following information:
1) Location of the node on which the replica is to
be placed
2) Version metadata
3) The actual data
Tunable consistency (in a distributed system, we
work with several servers: a few of the
servers in one data center and others in other data
centers).
Two types of consistency:
1) Strong consistency
2) Eventual consistency
Strong consistency:
When an update on data is initiated by a client,
the update propagates to all the nodes in the cluster
where that piece of data resides. The client gets a
successful acknowledgement only after the update is
done on all the nodes where that piece of data resides.
Eventual consistency:
The client receives a successful
acknowledgement as soon as the update is done on
one node.
Read consistency:
How many replicas must respond
before sending out the result to the client
application.

Write consistency:

How many replica writes must succeed
before sending out an acknowledgement to the
client application.
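The interplay of replication factor and the two consistency levels can be checked with the usual rule of thumb R + W > RF (a simplification; Cassandra expresses levels as ONE, QUORUM, ALL, and so on):

```python
def is_strongly_consistent(rf, write_replicas, read_replicas):
    """If the replicas that must acknowledge a write plus the replicas that
    must answer a read exceed the replication factor, every read overlaps
    at least one replica that holds the latest write."""
    return write_replicas + read_replicas > rf

# RF=3 with QUORUM writes (2) and QUORUM reads (2): strong consistency.
# RF=3 with ONE write (1) and ONE read (1): eventual consistency only.
```

This is why QUORUM/QUORUM at RF=3 is a popular setting: it tolerates one node failure on both reads and writes while still guaranteeing that reads see the latest acknowledged write.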
Data model in cassandra:
The Cassandra data model is a schema-optional,
column-oriented data model.
Unlike a relational database, you do not need to
model all of the columns required by your
application up front, as each row is not required to
have the same set of columns.
For each column family, don’t think of a relational table. Instead, think of a nested sorted
map data structure.
In Cassandra, a table is a list of "nested key-value pairs". (Row x Column Key x Column
value). In RDBMS, a table is an array of arrays. (Row x Column)

A nested sorted map is a more accurate analogy than a relational table


Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
SortedMap: provides a total ordering of its elements (elements can be traversed in
sorted order of keys). It ensures that the entries are maintained in an ascending key
order.
Map:It is a key-value pair.

A map gives efficient key lookup, and the sorted nature gives efficient scans.

In Cassandra, we can use row keys and column keys to do efficient lookups and range
scans. The number of column keys is unbounded. In other words, you can have wide rows.
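The nested-map analogy can be sketched with plain dictionaries (row and column names are invented for illustration; Python dicts are not sorted maps, so the range scan sorts the column keys explicitly):

```python
# table: Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
table = {
    "user42": {"email": "avi@example.com", "name": "Avinash", "phone": "555-0101"},
    "user43": {"name": "Prashant"},   # rows need not share the same columns
}

def get_column(row_key, column_key):
    """Efficient key lookup: row key, then column key."""
    return table[row_key][column_key]

def column_range(row_key, start, end):
    """Range scan over one row's columns, using sorted column-key order."""
    row = table[row_key]
    return [(k, row[k]) for k in sorted(row) if start <= k <= end]
```

Note how the second row simply omits the columns it does not have: nothing is stored for them, which is exactly the "wide, sparse row" behaviour described above.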

Column names in RDBMS are just strings. In Cassandra, they
can be long integers, UUIDs or any kind of byte array.


Columns in Cassandra actually have a third aspect: the
timestamp, which records the last time the column was
updated. This is not an automatic metadata property, however;
clients have to provide the timestamp along with the value
when they perform writes. You cannot query by the timestamp;
it is used purely for conflict resolution on the server side.


Rows do not have timestamps. Only each individual column
has a timestamp.

Cassandra uses a storage structure similar to a Log-Structured Merge (LSM)
tree, unlike a typical relational database that uses a B-Tree.


The early Cassandra data model was schemaless. But starting with the 0.7 release,
Cassandra allowed specifying the data types of columns at the time of defining a
column family. So, Cassandra moved from schemaless to schema-optional.


RDBMSs allocate room for each column in each row, up front. In Cassandra's
storage engine, each row is sparse: for a given row, we store only the
columns present in that row.


Thus, Cassandra gives you the flexibility normally associated with
schemaless systems, while also delivering the benefits of having a defined
schema.
Partition key:
The partition key is responsible for distributing data among nodes. The
partition key is the first component of the primary key (for a single-column
primary key, the two are the same).
Partition keys belong to a node. Cassandra is organized into a cluster of
nodes, with each node owning an equal part of the partition key hash range.
Imagine we have a four node Cassandra cluster. In the example cluster
below, Node 1 is responsible for partition key hash values 0-24; Node 2 is
responsible for partition key hash values 25-49; and so on.
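The four-node example can be written down directly (a toy hash space of 0-99, as in the text):

```python
def node_for_hash(key_hash, num_nodes=4, hash_space=100):
    """Map a partition-key hash to its owning node: node 1 owns 0-24,
    node 2 owns 25-49, node 3 owns 50-74, node 4 owns 75-99."""
    range_size = hash_space // num_nodes
    return key_hash // range_size + 1
```

Because ownership is a pure function of the hash, routing a request never needs a lookup table: any node can compute which node owns a key.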

Consider a Cassandra database that stores information on CrossFit gyms.
One property of CrossFit gyms is that each gym must have a unique name
i.e. no two gyms are allowed to share the same name. The table below is
useful for looking up a gym when we know the name of the gym we’re
looking for.
CREATE TABLE crossfit_gyms (
gym_name text,
city text,
state_province text,
country_code text,
PRIMARY KEY (gym_name)
);

Now suppose we want to look up gyms by location. If we use the crossfit_gyms table, we'll need to iterate
over the entire result set. Instead, we'll create a new table that will allow us to query gyms by country.

A first attempt might use only country_code as the primary key, but a primary key must be unique, so
this table could store only one gym per country:

CREATE TABLE crossfit_gyms_by_location (
country_code text,
state_province text,
city text,
gym_name text,
PRIMARY KEY (country_code)
);

Adding the remaining columns to the primary key fixes this:

CREATE TABLE crossfit_gyms_by_location (
country_code text,
state_province text,
city text,
gym_name text,
PRIMARY KEY (country_code, state_province, city, gym_name)
);


Clustering keys are responsible for sorting data within a partition. Each primary
key column after the partition key is considered a clustering key. In the
crossfit_gyms_by_location example, country_code is the partition key;
state_province, city, and gym_name are the clustering keys.

Clustering keys are sorted in ascending order by default. So when we query for
all gyms in the United States, the result set will be ordered first by state_province
in ascending order, followed by city in ascending order, and finally gym_name in
ascending order.
Order by
To sort in descending order, add a WITH clause to the end of the CREATE TABLE
statement.
CREATE TABLE crossfit_gyms_by_location (
country_code text,
state_province text,
city text,
gym_name text,
PRIMARY KEY (country_code, state_province, city, gym_name)
) WITH CLUSTERING ORDER BY (state_province DESC, city ASC,
gym_name ASC);

The result set will now contain gyms ordered first by
state_province in descending order, followed by city in
ascending order, and finally gym_name in ascending order. You
must specify the sort order for each of the clustering keys in
the CLUSTERING ORDER BY clause. The partition key is not part of the
clause because its values are hashed and therefore
won't be close to each other in the cluster.
Composite key
Composite keys are partition keys that consist of multiple columns. The
crossfit_gyms_by_location example only used country_code for partitioning. The result is that all
gyms in the same country reside within a single partition. This can lead to wide rows. In the case
of our example, there are over 7,000 CrossFit gyms in the United States, so using the single
column partition key results in a partition with over 7,000 rows.
To avoid wide rows, we can move to a composite key consisting of additional columns. If we
change the partition key to include the state_province and city columns, the partition hash value
will no longer be calculated off only country_code. Now, each combination of country_code,
state_province, and city will have its own hash value and be stored in a separate partition within
the cluster. We accomplish this by nesting parentheses around the columns we want included in
the composite key.
CREATE TABLE crossfit_gyms_by_city (
country_code text,
state_province text,
city text,
gym_name text,
opening_date timestamp,
PRIMARY KEY ((country_code, state_province, city), opening_date,
gym_name)
) WITH CLUSTERING ORDER BY ( opening_date ASC, gym_name ASC );

Notice that we are no longer sorting on the partition key columns. Each
combination of the partition keys is stored in a separate partition
within the cluster.
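The effect of the composite key can be simulated in miniature (the gym names and dates are invented): the three partition-key columns select a partition together, and rows inside a partition stay in clustering order.

```python
from collections import defaultdict

# Partitions keyed by the composite partition key; within a partition,
# rows are kept in clustering order (opening_date ASC, gym_name ASC).
partitions = defaultdict(list)

def insert_gym(country_code, state_province, city, opening_date, gym_name):
    partition_key = (country_code, state_province, city)  # hashed as one unit
    partitions[partition_key].append((opening_date, gym_name))
    partitions[partition_key].sort()   # keep the clustering order

insert_gym("US", "CA", "San Francisco", "2012-05-01", "CrossFit SoMa")
insert_gym("US", "CA", "San Francisco", "2010-03-15", "CrossFit Mission")
insert_gym("US", "WA", "Seattle", "2011-07-20", "CrossFit Belltown")
```

The two San Francisco gyms land in the same partition, sorted by opening date; the Seattle gym lands in a partition of its own.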

Rows are sparse: for example, some people have a second phone number and some
don't, and in an online form backed by Cassandra, there may be
some fields that are optional and some that are required. That's OK.
Instead of storing null for those values we don't know, which would
waste space, we just won't store that column at all for that row.
Super column family

A row in a super column family still contains columns, each
of which then contains subcolumns.
In Cassandra, the basic attributes that you can
set per keyspace are

Replication factor

Replica placement strategy
Column family Vs RDBMS Table

Although column families are defined, the columns are not.

You can freely add any column to any column family at any
time, depending on your needs.

A column family has two attributes: a name and a comparator.
The comparator value indicates how columns will be sorted
when they are returned to you in a query—according to long,
byte, UTF8, or other ordering.

Each column family is stored as a file on disk.

It’s an inherent part of Cassandra’s replica
design that all data for a single row must fit on a
single machine in the cluster. The reason for this
limitation is that rows have an associated row
key, which is used to determine the nodes that
will act as replicas for that row.


Further, the value of a single column cannot
exceed 2GB. Keep these things in mind as you
design your data model.

First of all, when designing a relational database, you specify
the structure of the tables up front by assigning all of the
columns in the table a name; later, when you write data, you’re
simply supplying values for the predefined structure.


But in Cassandra, you don’t define the columns up front; you
just define the column families you want in the keyspace, and
then you can start writing data without defining the columns
anywhere. That’s because in Cassandra, all of a column’s
names are supplied by the client. This adds considerable
flexibility to how your application works with data, and can
allow it to evolve organically over time.
Joins in Cassandra

You cannot perform joins in Cassandra. If you
have designed a data model and find that you
need something like a join, you’ll have to either
do the work on the client side, or create a
denormalized second column family that
represents the join results for you. This is
common among Cassandra users.


Performing joins on the client should be a very
rare case; you really want to duplicate
(denormalize) the data instead.

Column sorting
Columns have another aspect to their definition. In Cassandra, you
specify how column names will be compared for sort order when
results are returned to the client. Columns are sorted by the
“Compare With” type defined on their enclosing column family, and
you can choose from the following: AsciiType, BytesType,
LexicalUUIDType, IntegerType, LongType, TimeUUIDType, or
UTF8Type

Super column
To use a super column, you define your column family as type
Super. Then, you still have row keys as you do in a regular column
family, but you also reference the super column, which is simply a
name that points to a list or map of regular columns (sometimes
called the subcolumns).
Insert in Cassandra:
It is not required to place values in all the
columns. But it is mandatory to specify all the
columns that make up the primary key.

The columns that are missing do not occupy any
space on disk.

Columns in Cassandra actually have a third
aspect: the timestamp, which records the last time
the column was updated.

No Referential Integrity

Cassandra has no concept of referential integrity, and therefore has
no concept of joins.


In a relational database, you could specify foreign keys in a table to
reference the primary key of a record in another table. But
Cassandra does not enforce this.
Guidelines Around Query Patterns


Before starting with data modeling in Cassandra, we should identify
the query patterns and ensure that they adhere to the following
guidelines:


Each query should fetch data from a single partition
We should keep track of how much data is getting stored in a
partition, as Cassandra has limits around the number of columns that
can be stored in a single partition
It is OK to denormalize and duplicate the data to support different
kinds of query patterns over the same data.
Start with your queries. Ask what queries your application will need,
and model the data around that instead of modeling the data first, as
you would in the relational world.
Secondary Index
Write a query to get the hotelID given a hotel name. The query has
to scan all records, which incurs a lot of time.

To avoid this, the relational answer is to create an index on
the name column, which acts as a copy of the data that the
relational database can look up very quickly. Because the hotelID
already has a unique primary key constraint, it is automatically indexed,
and that is the primary index; for us to create another index on the
name column would constitute a secondary index, and Cassandra
does not currently support this.

To achieve the same thing in Cassandra, you create a second
column family that holds the lookup data. You create one column
family to store the hotel names, and map them to their IDs. The
second column family acts as an explicit secondary index.

Support for secondary indexes is currently being added to
Cassandra 0.7
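The lookup column family can be sketched as a second map (the hotel data is invented for the example):

```python
# Primary column family: hotels keyed by hotelID.
hotels_by_id = {
    "h1": {"name": "Grand Plaza", "city": "Mumbai"},
    "h2": {"name": "Sea View", "city": "Goa"},
}

# Second column family acting as an explicit secondary index: name -> hotelID.
hotel_id_by_name = {row["name"]: hotel_id for hotel_id, row in hotels_by_id.items()}

def hotel_id_for_name(name):
    """Constant-time lookup instead of scanning every hotel record."""
    return hotel_id_by_name[name]
```

The cost of this pattern is that the application must keep both column families in step on every insert and update, which is the price of denormalization.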
Denormalization:

In relational database design, we are often
taught the importance of normalization.

This is not an advantage when working with
Cassandra because it performs best when the
data model is denormalized.

The important point is that instead of modeling
the data first and then writing queries, with
Cassandra you model the queries and let the
data be organized around them.

Think of the most common query paths your
application will use, and then create the column
families that you need to support them.
Decision Tree:
A decision tree is a diagram of nodes and connecting branches. Nodes
indicate decision points, chance events, or branch terminals. Branches
correspond to each decision alternative or event outcome emerging from
a node.
Example Problem: A company has to decide whether to
manufacture a Smoke+Fire detector or a Motion detector.
Company knows that

It costs 1,00,000/- to develop Smoke+Fire detector and revenue will be
10,00,000/- and the chance of success is 50%

It costs 10,000/- to develop motion detector and revenue will be
4,00,000/- and the chance of success is 80%

Which product to develop?


The first decision (root node):

Start by drawing a small square on the left side of a piece of paper.
This is called the root node, or root. The root node represents the
first set of decision alternatives.

For each decision alternative draw a line, or branch, extending to
the right from the root node.
Chance outcomes
Each product development effort can have one of two outcomes. Each
project can either succeed or fail.
Draw a small circle, or chance node, at the end of the branch for the
smoke and fire detector. Draw a chance node at the end of the branch for
the motion detector.
From each chance node, draw two branches toward the right; one branch
represents success and the other represents failure. Label the branches
accordingly
Endpoints and payoffs
You can now complete all the branches with endpoints,
since there is no further branch information to represent.

Draw a small triangle at the end of each branch to
represent the endpoint.

Write the payoff value at the endpoint. In business
applications the payoff is usually a monetary value equal
to the anticipated net profit, or return on investment.

Net profit (or net loss) is the difference between the
investment cost and the total revenue.

A positive value indicates a net profit, while a negative
value indicates a net loss. In other words, if revenue
exceeds investment, then the effort is profitable.
Otherwise the effort is a net loss, or a breakeven result if
the payoff is zero.
Incorporate uncertainty (outcome probability)
You can now incorporate the relative outcome probability, or uncertainty,
associated with each chance event. You can express probabilities as
percentages or as decimal fractions.

Find the expected value (EV)


You are now ready to evaluate the relative merits of each decision
alternative.
Expected value (EV) is the way to combine payoffs and probabilities for
each node. The higher the EV, the better a particular decision alternative
on average when compared to the other alternatives in the decision tree.
The method for calculating EV differs slightly based on the type of
node. In the last figure, we consider chance nodes first.

You calculate the EV for any chance node by summing together all
the EVs for each branch that is connected to the node.

The general formula for calculating EV at any chance node is:
EVchance node = [EVbranch1 + EVbranch2 + . . . + EVbranchN ]

If the smoke and fire detector is successful, the EV is the payoff
(profit) multiplied by its probability, or $900,000 x 0.5 = $450,000.
The EV if the fire detector project fails is (-$100,000) x 0.5 = (-$50,000).

The EV for the decision to develop the smoke and fire detector
(incorporating both success and failure) is the sum of the EV for all
the eventualities.
EVchance node = (EVsuccess + EVfailure) = $450,000 + (-$50,000)
= $400,000.

Similarly, the EV for the decision to develop the motion detector is
given by EV = ($390,000 x 0.8) + [(-$10,000) x 0.2] = $310,000.
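The two EV calculations above can be reproduced in a few lines:

```python
def expected_value(cost, revenue, p_success):
    """EV of a product-development branch: net profit weighted by the
    success probability, plus the loss (the sunk cost) weighted by the
    failure probability."""
    profit = revenue - cost
    return profit * p_success + (-cost) * (1 - p_success)

# Smoke+Fire detector: cost 100,000, revenue 1,000,000, 50% success.
ev_smoke_fire = expected_value(100_000, 1_000_000, 0.5)
# Motion detector: cost 10,000, revenue 400,000, 80% success.
ev_motion = expected_value(10_000, 400_000, 0.8)
```

This yields EV = 400,000 for the smoke and fire detector against 310,000 for the motion detector, matching the figures in the text.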
The smoke and fire detector project has a higher EV than the motion
detector. You can report the analysis with these summarized presentation
points:

The smoke and fire detector is the better project to develop, despite the
greater risk. The significantly larger anticipated profits make the risk more
acceptable than the competing project.

The motion detector is less risky, but also significantly less profitable. With
the given profit expectations the motion detector project does not
overcome the expected value of its rival project.
