Introduction To: Nosql
NOSQL
COMPUTER SCIENCE AND ENGINEERING
(DATA SCIENCE)
Presented by
V. Nagarjuna
HISTORY OF NOSQL
The term NoSQL was coined by Carlo Strozzi in 1998. He used it to name his
open-source, lightweight database, which did not have an SQL interface.
In early 2009, when last.fm wanted to organize an event on open-source distributed
databases, Eric Evans, a Rackspace employee, reused the term to refer to databases that are
non-relational, distributed, and do not conform to atomicity, consistency, isolation, and
durability, the four defining features of traditional relational database systems.
In the same year, NoSQL was discussed and debated extensively at the "no:sql(east)"
conference held in Atlanta, USA.
After that, discussion and practice of NoSQL gained momentum, and NoSQL saw
unprecedented growth.
NOSQL……?
NoSQL is designed for distributed data stores with very large-scale data
storage needs (for example, Google or Facebook, which collect terabits of data
every day for their users). These types of data stores may not require fixed
Today, data is becoming easier to access and capture through third
parties such as Facebook, Google+ and others. Personal user information, social
graphs, geolocation data, user-generated content and machine logging data are
just a few examples where the data has been increasing exponentially. To provide
these services properly, huge amounts of data must be processed. Which SQL
Consistency :
This means that the data in the database remains consistent after the execution of an operation. For
example, after an update operation all clients see the same data.
Availability :
This means that the system is always on (guaranteed service availability), with no downtime.
Partition Tolerance :
This means that the system continues to function even if the communication among the servers is unreliable,
i.e. the servers may be partitioned into multiple groups that cannot communicate with one another.
CA - Single-site cluster, all nodes are always in contact. When a partition occurs, the system blocks.
CP - Some data may not be accessible, but the rest is still consistent/accurate.
AP - System is still available under partitioning, but some of the data returned may be inaccurate.
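The CP and AP trade-off above can be illustrated with a toy model. This is only a minimal sketch, not a real database: the `Cluster` and `Replica` classes, the `partitioned` flag, and the values `"v1"` and `"v2"` are hypothetical names chosen for illustration. In CP mode a write during a partition is refused so that both replicas stay consistent; in AP mode the write is accepted locally, so the other replica may return outdated data.

```python
class Replica:
    """One node holding a copy of a single value (hypothetical toy model)."""

    def __init__(self, value):
        self.value = value


class Cluster:
    """Two replicas joined by a network link that can be cut."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("v1")
        self.b = Replica("v1")
        self.partitioned = False  # set to True to simulate a network partition

    def write(self, node, value):
        other = self.b if node is self.a else self.a
        if not self.partitioned:
            # Link is up: update both copies so they stay identical.
            node.value = value
            other.value = value
        elif self.mode == "CP":
            # CP: give up availability rather than let the replicas diverge.
            raise RuntimeError("write refused: replicas cannot coordinate")
        else:
            # AP: stay available, accept the write locally, replicas diverge.
            node.value = value

    def read(self, node):
        return node.value


ap = Cluster("AP")
ap.partitioned = True
ap.write(ap.a, "v2")
print(ap.read(ap.a), ap.read(ap.b))  # the two replicas now disagree
```

The same sequence on `Cluster("CP")` raises `RuntimeError` on the write, and both replicas keep returning the old value.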
NOSQL PROS/CONS
Advantages :
• High scalability
• Distributed computing
• Lower cost
• Schema flexibility, semi-structured data
• No complicated relationships
Disadvantages :
• No standardization
• Limited query capabilities (so far)
THE BASE
The CAP theorem states that a distributed computer system cannot guarantee all of the
following three properties at the same time:
Consistency
Availability
Partition tolerance
A BASE system gives up on consistency.
o Basically Available indicates that the system does guarantee availability, in terms of
the CAP theorem.
o Soft state indicates that the state of the system may change over time, even without
input. This is because of the eventual consistency model.
o Eventual consistency indicates that the system will become consistent over time, given
that the system doesn't receive input during that time.
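Soft state and eventual consistency can be sketched with a few in-memory replicas. This is an illustrative assumption, not how any particular product works: the `Replica` class, the integer `version` used as a logical timestamp, and the `sync` function (a simple "newest version wins" pass) are all hypothetical. A write reaches one replica first; the others return stale data until a background sync runs, at which point their state changes without any new client input.

```python
class Replica:
    """A copy of one value, tagged with the version of the last write seen."""

    def __init__(self):
        self.value = None
        self.version = 0  # logical timestamp; higher means newer

    def write(self, value, version):
        # Keep only the newest write (last-write-wins).
        if version > self.version:
            self.value = value
            self.version = version


def sync(replicas):
    """Background pass: every replica adopts the newest version in the group."""
    newest = max(replicas, key=lambda r: r.version)
    for replica in replicas:
        replica.write(newest.value, newest.version)


a, b, c = Replica(), Replica(), Replica()
a.write("cart=[book]", version=1)   # the write lands on replica a only

stale = b.value                      # b has not seen the write yet
sync([a, b, c])                      # no new input; state still changes (soft state)
converged = a.value == b.value == c.value
```

Before `sync` runs, `stale` is `None`; afterwards all three replicas hold the same value, provided no further writes arrive in the meantime.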
ACID VS BASE
ACID          BASE
Atomic        Basically Available
Consistent    Soft state
Isolated      Eventual consistency
Durable
NOSQL CATEGORIES
There are four general types (most common categories) of NoSQL databases.
Each of these categories has its own specific attributes and limitations. No single
solution is better than all the others; however, some databases are better suited
to specific problems. To clarify the NoSQL databases,
let's discuss the most common categories :
• Key-value stores
• Column-oriented
• Graph
• Document oriented
KEY-VALUE STORES
In key-value storage, the database stores data as a hash table where each key is
unique and the value can be a string, JSON, a BLOB (Binary Large OBject), etc.
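The hash-table idea can be sketched with a Python dictionary. This is a minimal in-memory illustration, not the API of any real key-value database: the `KeyValueStore` class, its `put`/`get`/`delete` methods, and the keys and values used below are hypothetical.

```python
import json


class KeyValueStore:
    """A tiny in-memory key-value store backed by a hash table (dict)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Keys are unique: writing an existing key replaces its value.
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("user:42:name", "Alice")                             # string value
store.put("cart:42", json.dumps({"items": ["book", "pen"]}))   # JSON value
store.put("avatar:42", b"\x89PNG")                             # binary (BLOB) value

cart = json.loads(store.get("cart:42"))
```

The store treats every value as opaque: it can fetch `cart:42` by key, but it cannot query inside the JSON without the application decoding it first.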
Keys may be strings, hashes, lists, sets, or sorted sets, and values are stored against these keys.
For example, a key-value pair might consist of a key like "Name" that is associated with a
Key-value stores follow the 'Availability' and 'Partition tolerance' aspects of the CAP theorem.
Key-value stores would work well for shopping cart contents, or individual values like
A graph data structure consists of a finite (and possibly mutable) set of ordered pairs,
called edges or arcs, of certain entities called nodes or vertices.
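The definition above maps directly onto code: nodes are a set of entities and edges are a set of ordered pairs. The sketch below is illustrative only; the node names and the `adjacency` and `reachable` helpers are hypothetical and not taken from any graph database.

```python
from collections import defaultdict

# Nodes (vertices) and edges (ordered pairs: source, target).
nodes = {"alice", "bob", "carol"}
edges = {("alice", "bob"), ("bob", "carol")}


def adjacency(edges):
    """Index the edge set by source node for fast neighbour lookup."""
    adj = defaultdict(set)
    for source, target in edges:
        adj[source].add(target)
    return adj


def reachable(adj, start):
    """Return every node that can be reached from `start` by following edges."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for neighbour in adj.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


adj = adjacency(edges)
from_alice = reachable(adj, "alice")
```

Here `from_alice` contains `bob` and `carol`. Following edges like this is the kind of traversal that a relational database would express with joins.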
Rows Vertices
Joins Edges
Rows Documents
Mozilla
Adobe
Foursquare
Digg
McGraw-Hill Education