Unit 1 Notes in NoSQL
NoSQL databases are used for large sets of distributed data. Some big
data performance issues are not handled effectively by relational
databases; such issues are managed more easily by NoSQL databases.
They are very efficient at analyzing large volumes of unstructured
data that may be stored across multiple virtual servers in the cloud.
Why NoSQL?
NoSQL - probably the hottest term in database technology today -
was unheard of only a year ago. And yet, today, there are literally
dozens of database systems described as "NoSQL." How did all of
this happen so quickly?
The demands of big data and elastic provisioning call for a database
that can be distributed on large numbers of hosts spread out across
a widely dispersed network. While commercial relational databases
- such as Oracle's RAC - have taken steps to meet this challenge, it's
become apparent that some of the fundamental characteristics of
relational databases are incompatible with the demands of elasticity
and Big Data.
Ironically, the demand for NoSQL did not come about because of
problems with the SQL language. The demand is due to the strong
consistency and transactional integrity that relational databases
enforce, which are hard to sustain across a distributed system. In a
transactional relational database, all users see an identical view of
data. In 2000, however, Eric Brewer outlined the now famous CAP
theorem, which states that both Consistency and high Availability
cannot be maintained when a database is Partitioned across a
fallible wide area network.
Within the NoSQL zoo, there are several distinct family trees. Some
NoSQL databases are pure key-value stores without an explicit data
model, with many based on Amazon's Dynamo key-value store.
Others are heavily influenced by Google's BigTable database, which
supports Google products such as Google Maps and Google
Reader. Document databases store highly structured, self-describing
objects, usually in JSON, a lightweight text format that plays a role
similar to XML. Finally, graph databases store complex relationships
such as those found in social networks.
Introduction to NoSQL
NoSQL is a type of database management system (DBMS) that is
designed to handle and store large volumes of unstructured and semi-
structured data. Unlike traditional relational databases that use tables
with pre-defined schemas to store data, NoSQL databases use flexible
data models that can adapt to changes in data structures and are capable
of scaling horizontally to handle growing amounts of data.
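The idea of a flexible data model can be illustrated with a minimal Python sketch. The records and field names below are hypothetical, and plain dictionaries stand in for a document store; this is not the API of any particular NoSQL product. It only shows that records in the same collection need not share one pre-defined set of columns.

```python
import json

# Two hypothetical "user" records with different fields.
# No table schema is declared anywhere.
users = [
    {"id": 1, "name": "Tanmay", "email": "tanmay@example.com"},
    {"id": 2, "name": "Abhishek", "phones": ["555-0100"], "address": {"city": "Pune"}},
]


def field_names(records):
    """Return the set of top-level field names seen across all records."""
    names = set()
    for record in records:
        names.update(record.keys())
    return names


# A new attribute is added to one record without altering the others.
users[0]["last_login"] = "2024-01-01"

print(json.dumps(users, indent=2))
print(sorted(field_names(users)))
```

In a relational table, adding last_login would normally require changing the table definition for every row; here only the record that needs the field carries it.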
The term NoSQL originally referred to “non-SQL” or “non-relational”
databases, but the term has since evolved to mean “not only SQL,” as
NoSQL databases have expanded to include a wide range of different
database architectures and data models.
Here are a few situations in which you can use a key-value database:
User session attributes in an online app such as finance or gaming,
which is referred to as real-time random data access.
A caching mechanism for repeatedly accessed data, or a key-based
design.
An application built on queries that are based on keys.
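The session and caching use cases above can be sketched with a toy in-memory key-value store. The class name KeyValueStore, its methods, and the "session:42" key format are all assumptions made for illustration; real key-value databases expose their own client APIs.

```python
import time


class KeyValueStore:
    """A toy in-memory key-value store with optional per-key expiry."""

    def __init__(self):
        # Maps key -> (value, expiry time or None).
        self._data = {}

    def put(self, key, value, ttl_seconds=None):
        """Store a value; if ttl_seconds is given, the entry expires later."""
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        """Look up a value by key, treating expired entries as missing."""
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # lazily evict the expired entry
            return default
        return value

    def delete(self, key):
        """Remove a key if it exists."""
        self._data.pop(key, None)


store = KeyValueStore()
# Session attributes are written and read by a single key.
store.put("session:42", {"user": "tanmay", "cart": ["book"]}, ttl_seconds=1800)
print(store.get("session:42"))
print(store.get("session:99", default="no such session"))
```

Every operation addresses exactly one key, which is why such stores suit session data and caches but not queries that filter on the contents of the value.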
Features:
Advantages:
It is very easy to use. Because the database is so simple, a value can
be of any type, or even of different types when required.
Its response time is fast due to its simplicity, provided that the
surrounding environment is well built and optimized.
Key-value store databases are scalable vertically as well as
horizontally.
Built-in redundancy makes this database more reliable.
Disadvantages:
Here are some popular key-value databases which are widely used:
Couchbase: It permits SQL-style querying and text search.
Amazon DynamoDB: One of the most widely used key-value databases,
trusted by a large number of users. It can easily handle a large
number of requests every day and also provides various security
options.
Riak: It is a distributed key-value database used to develop
applications.
Aerospike: It is an open-source, real-time database that handles
billions of transactions.
Berkeley DB: It is a high-performance, open-source database
providing scalability.
A relational database stores data in rows and also reads the data row
by row, whereas a column store is organized as a set of columns. So if
someone wants to run analytics on a small number of columns, one can
read those columns directly without filling memory with unwanted
data. Values within a column are usually of the same type, so they
benefit from more efficient compression, which makes reads faster.
Examples of Columnar Data Model: Cassandra and Apache HBase.
Row-oriented table:
S.No.  Name       Course  Branch       ID
01.    Tanmay     B-Tech  Computer     2
02.    Abhishek   B-Tech  Electronics  5
03.    Samriddha  B-Tech  IT           7
04.    Aditi      B-Tech  E & TC       8

Column-oriented tables:
S.No.  Name       ID
01.    Tanmay     2
02.    Abhishek   5
03.    Samriddha  7
04.    Aditi      8

S.No.  Course  ID
01.    B-Tech  2
02.    B-Tech  5
03.    B-Tech  7
04.    B-Tech  8

S.No.  Branch       ID
01.    Computer     2
02.    Electronics  5
03.    IT           7
04.    E & TC       8
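The difference between the two layouts can be sketched in Python using the sample student data from the tables above. The helper names to_columns and run_length_encode are hypothetical, and run-length encoding is used here only as one simple example of column compression; the notes do not say which scheme any particular database uses.

```python
# Row-oriented layout: each record is stored together.
rows = [
    {"sno": 1, "name": "Tanmay", "course": "B-Tech", "branch": "Computer", "id": 2},
    {"sno": 2, "name": "Abhishek", "course": "B-Tech", "branch": "Electronics", "id": 5},
    {"sno": 3, "name": "Samriddha", "course": "B-Tech", "branch": "IT", "id": 7},
    {"sno": 4, "name": "Aditi", "course": "B-Tech", "branch": "E & TC", "id": 8},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def run_length_encode(values):
    """Compress consecutive repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


columns = to_columns(rows)

# An analytic query on one column touches only that column's values.
print(columns["branch"])

# A column of same-typed, repetitive values compresses well.
print(run_length_encode(columns["course"]))
```

Reading the branch column does not touch names or courses at all, and the four identical course values collapse into a single pair, which is the compression benefit described above.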
Columnar Data Model uses the concept of keyspace, which is like a
schema in relational models.
Advantages of Columnar Data Model :
In graph data models, nodes that are related are physically connected,
and the connection itself is stored as a piece of data. Connecting data
in this way makes it easy to query a relationship. This data model reads
the relationship directly from storage instead of calculating it through
a series of query steps. Like many other NoSQL databases, these data
models have no fixed schema, which keeps the model flexible and easy to
edit.
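A minimal Python sketch of this idea follows. The node names, the FOLLOWS relationship type, and the helper functions are hypothetical, and an in-memory dictionary stands in for native graph storage. The point is only that each relationship is stored as data and followed directly, rather than derived by joining tables.

```python
from collections import deque

# Each relationship is stored as data: (source, relationship type, target).
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
    ("alice", "FOLLOWS", "dave"),
    ("carol", "FOLLOWS", "erin"),
]

# Adjacency index: a node's relationships are read directly from it.
adjacency = {}
for source, rel_type, target in edges:
    adjacency.setdefault(source, []).append((rel_type, target))


def neighbors(node, rel_type):
    """Return the nodes directly connected to node by rel_type."""
    return [target for rel, target in adjacency.get(node, []) if rel == rel_type]


def reachable_within(start, rel_type, max_hops):
    """Breadth-first traversal: nodes reachable in at most max_hops steps."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors(node, rel_type):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, hops + 1))
    return found


print(neighbors("alice", "FOLLOWS"))
print(reachable_within("alice", "FOLLOWS", 2))
```

A "friends of friends" question like the second call would need repeated self-joins in a relational schema; here it is a short traversal over stored connections.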
Examples of Graph Data Models:
JanusGraph: It is very helpful in big data analytics. It is a
scalable, open-source graph database system. JanusGraph has
different features like:
Storage: Many options are available for storing graph data,
such as Cassandra.
Support for transactions: It supports ACID (Atomicity, Consistency,
Isolation, and Durability) transactions and can handle thousands of
concurrent users.
Searching options: Complex search options are available through
optional add-on support.
Neo4j: The name is often expanded as Network Exploration and
Optimization 4 Java. As the name suggests, this graph database is
written in Java, with native graph storage and processing. Neo4j has
different features like:
Scalable: Scalable through data partitioning into pieces known as
shards.
Higher Availability: Availability is very high due to continuous
backups and rolling upgrades.
Query Language: Uses Cypher, a programmer-friendly graph query
language.
DGraph: It is an open-source distributed graph database system
designed for scalability. DGraph's main features are:
Query Language: It uses GraphQL, a query language designed for APIs.
Open-source system: Support for many open standards.