Nosql Cassandra Database: What Is Apache Cassandra?
Apache Cassandra is a highly scalable, distributed, high-performance NoSQL database. It is designed to handle very large
amounts of data.
In the image above, the circles are Cassandra nodes and the lines between them show the distributed architecture, while the client
sends data to a node.
Cassandra handles this volume of data through its distributed architecture. Data is placed on different machines with a replication
factor greater than one, which provides high availability and avoids a single point of failure.
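The effect of the replication factor can be illustrated with a small sketch. This is a simplified model, not Cassandra's actual placement logic: the node names, the key, the MD5-based hashing and the helper functions are all assumptions made for illustration.

```python
import hashlib


def replicas_for(key: str, nodes: list[str], replication_factor: int) -> list[str]:
    """Pick `replication_factor` distinct nodes for a key by walking the ring."""
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication factor must be between 1 and the node count")
    # Hash the key to a stable starting position (MD5 is used only for determinism).
    digest = hashlib.md5(key.encode("utf-8")).digest()
    start = int.from_bytes(digest, "big") % len(nodes)
    # Take the next nodes clockwise, wrapping around the end of the list.
    return [nodes[(start + i) % len(nodes)] for i in range(replication_factor)]


def readable_after_failure(key: str, nodes: list[str],
                           replication_factor: int, failed: set[str]) -> bool:
    """True if at least one replica of the key sits on a node that is still up."""
    return any(n not in failed for n in replicas_for(key, nodes, replication_factor))


# Hypothetical four-node cluster.
nodes = ["node-a", "node-b", "node-c", "node-d"]
owners = replicas_for("user:42", nodes, 3)
print("replicas:", owners)
print("survives one failure:",
      readable_after_failure("user:42", nodes, 3, {owners[0]}))
```

With a replication factor of one, losing the single owning node makes the key unreadable, which is why a factor greater than one is needed to avoid a single point of failure.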
Cassandra History
Design Simplicity
Horizontal Scaling
High Availability
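Horizontal scaling means adding more machines to the pool of resources, whereas vertical scaling
means adding more power (CPU, RAM) to an existing machine. An easy way to remember this is to
think of a machine on a server rack: we add more machines across the horizontal direction and add
more resources to a machine in the vertical direction.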
In the database world, horizontal scaling is often based on partitioning the data, i.e. each node
contains only part of the data. In vertical scaling the data resides on a single node, and scaling is
done through multi-core, i.e. by spreading the load across the CPU and RAM resources of that machine.
With horizontal scaling it is often easier to scale dynamically by adding more machines to the
existing pool. Vertical scaling is often limited to the capacity of a single machine; scaling beyond
that capacity often involves downtime and comes with an upper limit.
Good examples of horizontal scaling are Cassandra and MongoDB. A good example of vertical
scaling is MySQL on Amazon RDS (the cloud version of MySQL), which provides an easy way to scale
vertically by switching from smaller to bigger machines. This process often involves downtime.
https://stackoverflow.com/questions/11707879/difference-between-scaling-horizontally-and-vertically-for-databases
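The idea that each node holds only part of the data, and that machines can be added to an existing pool, can be sketched with a toy consistent-hashing ring. This is an illustrative model only: the `Ring` class, the `token` helper, the node names and the number of virtual tokens per node are assumptions, and Cassandra's own partitioner works differently in detail.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Toy hash ring: each node owns several tokens, each key goes to the next token."""

    def __init__(self, nodes, tokens_per_node=64):
        self.tokens_per_node = tokens_per_node
        self._tokens = []   # sorted list of all tokens on the ring
        self._owner = {}    # token -> node that owns it
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        # Scatter the node's tokens around the ring.
        for i in range(self.tokens_per_node):
            t = token(f"{node}#{i}")
            self._owner[t] = node
            bisect.insort(self._tokens, t)

    def node_for(self, key):
        # First token clockwise from the key's position, wrapping at the end.
        idx = bisect.bisect_right(self._tokens, token(key)) % len(self._tokens)
        return self._owner[self._tokens[idx]]


keys = [f"row-{i}" for i in range(10_000)]
ring = Ring(["node-a", "node-b", "node-c"])
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-d")  # scale out by one machine
after = {k: ring.node_for(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

In this model only the keys taken over by the new node change owner, and the rest stay where they were. That is what makes it practical to add a machine to a running pool.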
Massively Scalable Architecture: Cassandra has a masterless design in which all nodes are
at the same level, which provides operational simplicity and easy scale-out.
Masterless Architecture: Data can be written and read on any node.
Linear Scale Performance: As more nodes are added, Cassandra's throughput increases
roughly linearly.
No Single Point of Failure: Cassandra replicates data across different nodes, which ensures
there is no single point of failure.
Fault Detection and Recovery: Failed nodes can easily be restored and recovered.
Flexible and Dynamic Data Model: Supports a range of data types with fast writes and reads.
Data Protection: Data is protected by the commit log design and by built-in safeguards such as
backup and restore mechanisms.
Tunable Data Consistency: The consistency level can be tuned per operation, from eventual
up to strong consistency, across the distributed architecture.
Multi Data Center Replication: Cassandra can replicate data across multiple
data centers.
Data Compression: Cassandra can compress data by up to 80% with little overhead.
Cassandra Query Language: Cassandra provides a query language (CQL) that is similar to
SQL. This makes it much easier for developers to move from a relational
database to Cassandra.
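The tunable consistency feature listed above comes down to simple arithmetic over the replication factor. The following is a minimal sketch of that arithmetic, not a client API: the function names are hypothetical, and `ONE`, `QUORUM` and `ALL` are used here only as labels for how many replicas must acknowledge an operation.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def is_strongly_consistent(write_replicas: int, read_replicas: int,
                           replication_factor: int) -> bool:
    """A read is guaranteed to overlap the latest write when R + W > RF."""
    return write_replicas + read_replicas > replication_factor


RF = 3  # assumed replication factor for this example
levels = {"ONE": 1, "QUORUM": quorum(RF), "ALL": RF}

for write_name, w in levels.items():
    for read_name, r in levels.items():
        verdict = "strong" if is_strongly_consistent(w, r, RF) else "eventual"
        print(f"write={write_name:<6} read={read_name:<6} -> {verdict}")
```

Writing and reading at a majority gives strong consistency while still tolerating a failed replica. Using a single replica for both favors latency and availability at the cost of possibly stale reads.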
Messaging
Cassandra is a great database for companies that provide mobile phone and
messaging services. These companies handle huge amounts of data, so Cassandra is a good
fit for them.
Cassandra is a great database for applications where data arrives at very high speed
from different devices or sensors.
Cassandra is used by many retailers for durable shopping cart protection and fast product
catalog input and output.
Social Media Analytics and recommendation engine
Cassandra is a great database for many online companies and social media providers, which use it
for analysis and for making recommendations to their customers.