Apache Cassandra
Gossip protocol:
● Each node has some data associated with it and periodically gossips that data with another node.
● Node A randomly selects a node B from a list of nodes known to it.
● A sends a message to B containing A's data.
● B sends back a response containing its data.
● A and B update their data sets by merging them with the received data.
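The merge step in the exchange above can be sketched as follows. This is a hypothetical simplification: each node's data set is modeled as a map from key to a (value, version) pair, with the higher version winning on merge (real Cassandra gossip exchanges endpoint state with generation and heartbeat numbers):

```python
# Minimal sketch of one push-pull gossip round between two nodes.
# Each node's state maps a key to a (value, version) pair; on merge,
# the entry with the higher version wins. (Illustrative simplification.)

def merge(mine, theirs):
    """Return a new state containing the newest entry for every key."""
    merged = dict(mine)
    for key, (value, version) in theirs.items():
        if key not in merged or version > merged[key][1]:
            merged[key] = (value, version)
    return merged

def gossip_round(node_a, node_b):
    """A sends its state to B; B replies with its state; both merge."""
    node_b_state = merge(node_b, node_a)   # B merges A's message
    node_a_state = merge(node_a, node_b)   # A merges B's response
    return node_a_state, node_b_state

a = {"x": ("1", 1), "y": ("old", 1)}
b = {"y": ("new", 2), "z": ("3", 1)}
a2, b2 = gossip_round(a, b)
# After one round, both nodes hold the same merged state.
```

After enough rounds against randomly chosen peers, every node's view converges to the newest version of every key.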
Partitioner:
The partitioner decides how data is distributed across the various nodes in a cluster. It determines the node on which the first copy of a piece of data is placed: a hash function computes a token from the partition/primary key.
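A partitioner of this kind can be sketched as below. The ring size, node count, and use of MD5 are hypothetical choices for illustration; Cassandra's real partitioners (e.g. RandomPartitioner, Murmur3Partitioner) differ in detail:

```python
import hashlib

RING_SIZE = 100  # hypothetical token space 0..99

def token_for(partition_key: str) -> int:
    """Hash the partition key to a token on the ring."""
    digest = hashlib.md5(partition_key.encode()).hexdigest()
    return int(digest, 16) % RING_SIZE

def node_for(token: int, num_nodes: int = 4) -> int:
    """Map a token to a node: Node 1 owns tokens 0-24, Node 2 owns 25-49, etc."""
    return token // (RING_SIZE // num_nodes) + 1

# The same key always hashes to the same token, hence the same node.
print(node_for(token_for("gym:CrossFit Golden Gate")))
```

Because the token depends only on the key, any node can compute which replica owns a given piece of data without coordination.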
Replication Factor:
It determines the number of copies of data that
will be stored across nodes in a cluster.
● To achieve fault tolerance, a given piece of data is replicated on one or more nodes.
● A client can connect to any node in the cluster to read data.
● How many nodes are read before responding to the client depends on the consistency level specified by the client. If the specified consistency level cannot be met, the read operation blocks.
● A few of the nodes may respond with an out-of-date value. In that case, Cassandra initiates a read repair operation to bring the replicas holding stale values up to date.
● For repairing, Cassandra uses an anti-entropy protocol: it compares all the replicas of each piece of data and updates each replica to the newest version.
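The read path described above can be sketched as follows. The consistency-level names match Cassandra's (ONE, QUORUM, ALL); the data representation and read-repair hook are hypothetical simplifications:

```python
# Sketch of a consistency-level check on the read path.
# With replication factor rf, QUORUM needs a majority of replicas.

def replicas_required(consistency: str, rf: int) -> int:
    if consistency == "ONE":
        return 1
    if consistency == "QUORUM":
        return rf // 2 + 1
    if consistency == "ALL":
        return rf
    raise ValueError(f"unknown consistency level: {consistency}")

def read(responses, consistency, rf):
    """responses: list of (value, timestamp) pairs from replicas that answered."""
    needed = replicas_required(consistency, rf)
    if len(responses) < needed:
        return None  # consistency level not met: the read blocks/fails
    newest = max(responses, key=lambda vt: vt[1])
    stale = [r for r in responses if r[1] < newest[1]]
    if stale:
        pass  # here a read repair would push `newest` to the stale replicas
    return newest[0]
```

For example, with rf=3 a QUORUM read needs two replicas to answer, and the value with the newest timestamp wins.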
Writes in Cassandra
Sequence of actions performed when a client initiates a write request:
1) Data is first written to the commit log. (Data in the commit log is purged after its corresponding data in the memtable is flushed to an SSTable. The commit log exists to recover the data in the memtable in the event of a hardware failure.)
2) The write is then pushed to a memory-resident data structure called the memtable. A threshold is defined for the memtable; when the number of objects stored in the memtable reaches that threshold, the contents of the memtable are flushed to disk in a file called an SSTable (Sorted String Table).
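The two steps above can be sketched as a tiny in-process model. The class and threshold are hypothetical; real Cassandra flushes based on memory size, not object count:

```python
# Sketch of the write path: append to a commit log, insert into a memtable,
# and flush the memtable to a sorted SSTable once a threshold is reached.

class Store:
    THRESHOLD = 3  # hypothetical flush threshold (object count)

    def __init__(self):
        self.commit_log = []   # replayed on restart to rebuild the memtable
        self.memtable = {}
        self.sstables = []     # each SSTable is a sorted list of (key, value)

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1) durability first
        self.memtable[key] = value            # 2) then the memtable
        if len(self.memtable) >= self.THRESHOLD:
            self.flush()

    def flush(self):
        sstable = sorted(self.memtable.items())  # "sorted string table"
        self.sstables.append(sstable)
        self.memtable.clear()
        self.commit_log.clear()  # purged once the memtable is flushed

store = Store()
for k, v in [("b", 1), ("a", 2), ("c", 3)]:
    store.write(k, v)
# After the third write, the memtable is flushed to one sorted SSTable.
```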
Hinted handoffs:
A hint contains the following information:
1) Location of the node on which the replica is to
be placed
2) Version metadata
3) The actual data
Tunable consistency (in a distributed system we work with several servers; some of them are in one data center and others in other data centers):
Two types of consistency:
1) Strong consistency
2) Eventual consistency
Strong consistency:
When a client initiates an update on a piece of data, the update propagates to all the nodes in the cluster where that data resides. The client gets a successful acknowledgement only after the update is done on all of those nodes.
Eventual consistency:
The client receives a successful acknowledgement as soon as the update is done on one node; the remaining replicas are updated in the background.
Read consistency:
How many replicas must respond
before sending out the result to the client
application.
Write consistency:
How many replicas must acknowledge a write before a success is returned to the client application.
A map gives efficient key lookup, and the sorted nature gives efficient scans.
In Cassandra, we can use row keys and column keys to do efficient lookups and range
scans. The number of column keys is unbounded. In other words, you can have wide rows.
● Column names in an RDBMS are just strings. In Cassandra, they can be long integers, UUIDs, or any kind of byte array.
● Columns in Cassandra actually have a third aspect: the timestamp, which records the last time the column was updated. This is not an automatic metadata property, however; clients have to provide the timestamp along with the value when they perform writes. You cannot query by the timestamp; it is used purely for conflict resolution on the server side.
● Rows do not have timestamps; only each individual column has a timestamp.
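Client-supplied timestamps drive that server-side conflict resolution: when two writes touch the same column, the one with the later timestamp wins. A minimal sketch with hypothetical data:

```python
# Last-write-wins reconciliation of two versions of the same column.
# Each column is a (name, value, client_supplied_timestamp) tuple.

def reconcile(col_a, col_b):
    """Return the column version with the newer client-supplied timestamp."""
    return col_a if col_a[2] >= col_b[2] else col_b

winner = reconcile(("email", "old@example.com", 100),
                   ("email", "new@example.com", 200))
print(winner[1])  # the value written with the later timestamp
```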
➔ Cassandra uses a storage structure similar to a Log-Structured Merge (LSM) tree, unlike a typical relational database, which uses a B-tree.
➔ The early Cassandra data model was schemaless, but starting with the 0.7 release, Cassandra allowed specifying the data types of columns when defining a column family. So, Cassandra moved from schemaless to schema-optional.
➔ An RDBMS allocates room for each column in each row, up front. In Cassandra's storage engine, each row is sparse: for a given row, only the columns present in that row are stored.
➔ Thus, Cassandra gives you the flexibility normally associated with schemaless systems, while also delivering the benefits of having a defined schema.
Partition key:
The partition key is responsible for distributing data among nodes. In the simplest case, the partition key is the same as the primary key; with a compound primary key, its first component is the partition key.
Partition key hashes belong to a node. Cassandra is organized into a cluster of nodes, with each node owning an equal part of the partition key hash range. Imagine we have a four-node Cassandra cluster. In the example cluster below, Node 1 is responsible for partition key hash values 0-24, Node 2 is responsible for partition key hash values 25-49, and so on.
● Consider a Cassandra database that stores information on CrossFit gyms. One property of CrossFit gyms is that each gym must have a unique name, i.e. no two gyms are allowed to share the same name. The table below is useful for looking up a gym when we know the name of the gym we’re looking for.
CREATE TABLE crossfit_gyms (
gym_name text,
city text,
state_province text,
country_code text,
PRIMARY KEY (gym_name)
);
Now suppose we want to look up gyms by location. If we use the crossfit_gyms table, we’ll need to iterate over the entire result set. Instead, we’ll create a new table, crossfit_gyms_by_location, that will allow us to query gyms by country.
● Clustering keys are responsible for sorting data within a partition. Each primary key column after the partition key is considered a clustering key. In the crossfit_gyms_by_location example, country_code is the partition key; state_province, city, and gym_name are the clustering keys.
● Clustering keys are sorted in ascending order by default. So when we query for all gyms in the United States, the result set will be ordered first by state_province in ascending order, followed by city in ascending order, and finally gym_name in ascending order.
Order by
To sort in descending order, add a WITH clause to the end of the CREATE TABLE
statement.
CREATE TABLE crossfit_gyms_by_location (
country_code text,
state_province text,
city text,
gym_name text,
PRIMARY KEY (country_code, state_province, city, gym_name)
) WITH CLUSTERING ORDER BY (state_province DESC, city ASC,
gym_name ASC);
Notice that we are not sorting on the partition key column. Each combination of partition key values is stored in a separate partition within the cluster.
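Within one partition, the clustering order declared in the CREATE TABLE above (state_province DESC, city ASC, gym_name ASC) determines how rows come back. That ordering can be simulated with two stable sorts (the sample rows are hypothetical):

```python
# Rows of one country_code partition: (state_province, city, gym_name).
rows = [
    ("WA", "Seattle", "CrossFit Belltown"),
    ("CA", "San Francisco", "CrossFit Golden Gate"),
    ("WA", "Bellevue", "CrossFit Bellevue"),
    ("CA", "San Francisco", "CrossFit SoMa"),
]

# Mixed DESC/ASC ordering: sort ascending on the later clustering keys
# first, then do a stable sort descending on the first clustering key.
rows.sort(key=lambda r: (r[1], r[2]))      # city ASC, gym_name ASC
rows.sort(key=lambda r: r[0], reverse=True)  # state_province DESC

for row in rows:
    print(row)
```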
● For example, some people have a second phone number and some don’t, and in an online form backed by Cassandra, there may be some fields that are optional and some that are required. That’s OK. Instead of storing null for those values we don’t know, which would waste space, we just don’t store that column at all for that row.
Super column family
● A row in a super column family still contains columns, each of which then contains subcolumns.
In Cassandra, the basic attributes that you can set per keyspace are:
● Replication factor
● Replica placement strategy
Column family vs. RDBMS table
● Although column families are defined, the columns are not.
● You can freely add any column to any column family at any time, depending on your needs.
● A column family has two attributes: a name and a comparator. The comparator value indicates how columns will be sorted when they are returned to you in a query: according to long, byte, UTF8, or other ordering.
● Each column family is stored as a file on disk.
● It’s an inherent part of Cassandra’s replica design that all data for a single row must fit on a single machine in the cluster. The reason for this limitation is that rows have an associated row key, which is used to determine the nodes that will act as replicas for that row.
● Further, the value of a single column cannot exceed 2 GB. Keep these things in mind as you design your data model.
➔ First of all, when designing a relational database, you specify the structure of the tables up front by assigning all of the columns in the table a name; later, when you write data, you’re simply supplying values for the predefined structure.
➔ But in Cassandra, you don’t define the columns up front; you just define the column families you want in the keyspace, and then you can start writing data without defining the columns anywhere. That’s because in Cassandra, all of a column’s names are supplied by the client. This adds considerable flexibility to how your application works with data, and can allow it to evolve organically over time.
Joins in Cassandra
● You cannot perform joins in Cassandra. If you have designed a data model and find that you need something like a join, you’ll have to either do the work on the client side, or create a denormalized second column family that represents the join results for you. This is common among Cassandra users.
● Performing joins on the client should be a very rare case; you really want to duplicate (denormalize) the data instead.
Column sorting
Columns have another aspect to their definition. In Cassandra, you specify how column names will be compared for sort order when results are returned to the client. Columns are sorted by the “Compare With” type defined on their enclosing column family, and you can choose from the following: AsciiType, BytesType, LexicalUUIDType, IntegerType, LongType, TimeUUIDType, or UTF8Type.
Super column
To use a super column, you define your column family as type Super. Then, you still have row keys as you do in a regular column family, but you also reference the super column, which is simply a name that points to a list or map of regular columns (sometimes called the subcolumns).
Insert in Cassandra:
It is not required to place values in all the columns, but it is mandatory to specify all the columns that make up the primary key.
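For example, against the crossfit_gyms table defined earlier, an insert may omit optional columns but must supply the primary key column gym_name (a sketch; the sample values are hypothetical):

```sql
-- Valid: the primary key column gym_name is supplied; city is omitted.
INSERT INTO crossfit_gyms (gym_name, country_code)
VALUES ('CrossFit Golden Gate', 'US');

-- Invalid: the primary key column gym_name is missing.
-- INSERT INTO crossfit_gyms (city, country_code) VALUES ('Seattle', 'US');
```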
➔ In a relational database, you could specify foreign keys in a table to reference the primary key of a record in another table. Cassandra does not enforce this.
Guidelines around query patterns
Before starting with data modeling in Cassandra, we should identify the query patterns and ensure that they adhere to the following guidelines:
● Each query should fetch data from a single partition.
● We should keep track of how much data is getting stored in a partition, as Cassandra has limits around the number of columns that can be stored in a single partition.
● It is OK to denormalize and duplicate the data to support different kinds of query patterns over the same data.
● Start with your queries. Ask what queries your application will need, and model the data around that, instead of modeling the data first as you would in the relational world.
Secondary index
Write a query to get the hotelID given a hotel name. The query has to scan all records, which takes a lot of time.
● To avoid this, the relational answer is to create an index on the name column, which acts as a copy of the data that the relational database can look up very quickly. Because hotelID is already a unique primary key constraint, it is automatically indexed, and that is the primary index; creating another index on the name column would constitute a secondary index, and Cassandra does not currently support this.
● To achieve the same thing in Cassandra, you create a second column family that holds the lookup data. You create one column family to store the hotel names and map them to their IDs. The second column family acts as an explicit secondary index.
● Support for secondary indexes is currently being added to Cassandra 0.7.
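Such an explicit lookup table might look like the sketch below (the table and column names are hypothetical, not from the source):

```sql
-- Hypothetical lookup table acting as an explicit secondary index:
-- map hotel names to their IDs so a lookup by name hits one partition.
CREATE TABLE hotels_by_name (
    hotel_name text,
    hotel_id text,
    PRIMARY KEY (hotel_name)
);

-- Find the hotelID by name without scanning all hotel records.
SELECT hotel_id FROM hotels_by_name WHERE hotel_name = 'Grand Plaza';
```

The application must write to both column families on every insert or update, which is the denormalization trade-off discussed below.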
Denormalization:
● In relational database design, we are often taught the importance of normalization.
● Normalization is not an advantage when working with Cassandra, because Cassandra performs best when the data model is denormalized.
● The important point is that instead of modeling the data first and then writing queries, with Cassandra you model the queries and let the data be organized around them.
● Think of the most common query paths your application will use, and then create the column families that you need to support them.
Decision Tree:
A decision tree is a diagram of nodes and connecting branches. Nodes
indicate decision points, chance events, or branch terminals. Branches
correspond to each decision alternative or event outcome emerging from
a node.
Example problem: A company has to decide whether to manufacture a Smoke+Fire detector or a Motion detector. The company knows that:
● It costs 1,00,000/- to develop the Smoke+Fire detector, revenue will be 10,00,000/-, and the chance of success is 50%.
● It costs 10,000/- to develop the Motion detector, revenue will be 4,00,000/-, and the chance of success is 80%.
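The standard way to resolve this tree is to compare expected values, assuming the development cost is always paid and the revenue is earned only on success (an assumption; the source does not state the payoff on failure). Note the Indian-style figures: 1,00,000 = 100000 and 10,00,000 = 1000000.

```python
# Expected value of a decision branch: EV = p(success) * revenue - cost.
# Assumes the cost is always incurred and revenue arrives only on success.

def expected_value(cost, revenue, p_success):
    return p_success * revenue - cost

ev_smoke_fire = expected_value(100_000, 1_000_000, 0.50)
ev_motion = expected_value(10_000, 400_000, 0.80)

best = max([("Smoke+Fire detector", ev_smoke_fire),
            ("Motion detector", ev_motion)], key=lambda b: b[1])
print(round(ev_smoke_fire), round(ev_motion), best[0])
```

Under these assumptions the Smoke+Fire branch has the higher expected value, despite its lower chance of success.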