Unit 1 Notes in NoSQL

NoSQL databases are used to handle large, distributed datasets and are effective for analyzing unstructured cloud data. They arose due to the demands of big data and cloud computing on relational databases. NoSQL databases sacrifice consistency for availability and scalability. There are several families of NoSQL databases including key-value, document, graph and BigTable influenced databases.

Uploaded by

sudhaaass


NoSQL Databases

NoSQL databases are used for large sets of distributed data. Some big
data performance issues that relational databases do not handle
effectively are managed more easily by NoSQL databases. They are very
efficient at analyzing large volumes of unstructured data that may be
stored across multiple virtual servers in the cloud.

Why NoSQL?
NoSQL - probably the hottest term in database technology today -
was unheard of only a year ago. And yet, today, there are literally
dozens of database systems described as "NoSQL." How did all of
this happen so quickly?

Although the term "NoSQL" is barely a year old, in reality, most of
the databases described as NoSQL have been around a lot longer
than the term itself. Many databases described as NoSQL arose
over the past few years as reactions to strains placed on traditional
relational databases by two other significant trends affecting our
industry: big data and cloud computing.

Of course, database volumes have grown continuously since the
earliest days of computing, but that growth has intensified
dramatically over the past decade as databases have been tasked
with accepting data feeds from customers, the public, point of sale
devices, GPS, mobile devices, RFID readers and so on.

Cloud computing also has placed new challenges on the database.
The economic vision for cloud computing is to provide computing
resources on demand with a "pay-as-you-go" model. A pool of
computing resources can exploit economies of scale and a levelling
of variable demand by adding or subtracting computing resources as
workload demand changes. The traditional RDBMS has been unable
to provide these types of elastic services.

The demands of big data and elastic provisioning call for a database
that can be distributed on large numbers of hosts spread out across
a widely dispersed network. While commercial relational databases
- such as Oracle's RAC - have taken steps to meet this challenge, it's
become apparent that some of the fundamental characteristics of
relational databases are incompatible with the elastic and Big Data
demands.

Ironically, the demand for NoSQL did not come about because of
problems with the SQL language. The demand is due to the strong
consistency and transactional integrity of relational databases. In a
transactional relational database, all users see an identical view of
data. In 2000, however, Eric Brewer outlined the now famous CAP
theorem, which states that both Consistency and high Availability
cannot be maintained when a database is Partitioned across a
fallible wide area network.

Google, Facebook, Amazon and other huge web sites, therefore,
developed non-relational databases that sacrificed consistency for
availability and scalability. It just so happened that these
databases didn't support the SQL language either, and, when a
group of developers organized a meeting in June 2009 to discuss
these non-relational databases, the term "NoSQL" seemed
convenient. Perhaps unfortunately, the term NoSQL caught on
beyond expectations, and now is used as shorthand for any non-
relational database.

Within the NoSQL zoo, there are several distinct family trees. Some
NoSQL databases are pure key-value stores without an explicit data
model, with many based on Amazon's Dynamo key-value store.
Others are heavily influenced by Google's BigTable database, which
supports Google products such as Google Maps and Google
Reader. Document databases store highly structured self-
describing objects, usually in an XML-like format called JSON.
Finally, graph databases store complex relationships such as those
found in social networks.

Within these four NoSQL families are at least a dozen database
systems of significance. Some probably will disappear as the
NoSQL segment matures, and, right now, it's anyone's guess as to
which ones will win, and which will lose.

NoSQL is a fairly imprecise term - it defines what the databases are
not, rather than what they are, and rejects SQL rather than the more
relevant strict consistency of the relational model. As imprecise
as the term may be, however, there's no doubt that NoSQL
databases represent an important direction in database technology.

The Value of Relational Databases

NoSQL databases can store relationship data — they just
store it differently than relational databases do. In fact, when
compared with relational databases, many find modeling
relationship data in NoSQL databases to be easier than in
relational databases, because related data doesn't have to be
split between tables.
1. Getting at Persistent Data
Two areas of memory:
• Fast, small, volatile main memory
• Larger, slower, non-volatile backing store
Since main memory is volatile, to keep data around we write it to a
backing store, commonly a disk, which provides persistent storage.
The backing store can be:
• File system
• Database
A database allows more flexibility than a file system in
storing large amounts of data in a way that allows an application
program to get information quickly and easily.
2. Concurrency
• Enterprise applications tend to have many people using the same data
at once, possibly modifying that data. We have to worry about
coordinating interactions between them to avoid things like double
booking of hotel rooms.
• Since enterprise applications can have lots of users and other
systems all working concurrently, there’s a lot of room for bad
things to happen. Relational databases help to handle this by
controlling all access to their data through transactions.
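The double-booking problem above can be sketched with a transaction. This is only a minimal illustration using Python's built-in sqlite3 module; the table, the column names, and the book helper are assumptions made for the example and are not part of the notes.

```python
import sqlite3

# In-memory database; the schema is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE booking ("
    "room INTEGER, night TEXT, guest TEXT, "
    "PRIMARY KEY (room, night))"  # one booking per room per night
)

def book(conn, room, night, guest):
    """Try to book a room; return True on success, False if already taken."""
    try:
        # Using the connection as a context manager commits on success
        # and rolls back if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO booking VALUES (?, ?, ?)", (room, night, guest)
            )
        return True
    except sqlite3.IntegrityError:
        # The primary key constraint rejected a second booking.
        return False

first = book(conn, 101, "2024-05-01", "alice")
second = book(conn, 101, "2024-05-01", "bob")
bookings = conn.execute("SELECT COUNT(*) FROM booking").fetchone()[0]
print(first, second, bookings)
```

The database, not the application, decides which of the two requests wins, which is the kind of coordination the transaction mechanism is meant to provide.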
3. Integration
• Enterprises require multiple applications, written by different
teams, to collaborate in order to get things done. Applications often
need to use the same data, and updates made through one application
have to be visible to others.
• A common way to do this is shared database integration where
multiple applications store their data in a single database.
• Using a single database allows all the applications to use each
others’ data easily, while the database’s concurrency control
handles multiple applications in the same way as it handles multiple
users in a single application.
4. A (Mostly) Standard Model
• Relational databases have succeeded because they provide the core
benefits in a (mostly) standard way.
• As a result, developers can learn the basic relational model and
apply it in many projects.
• Although there are differences between different relational
databases, the core mechanisms remain the same.
5. Impedance Mismatch
• For application developers using relational databases, the biggest
frustration has been what’s commonly called the impedance mismatch:
the difference between the relational model and the in-memory data
structures.
• The relational data model organizes data into a structure of
tables, where a tuple is a set of name-value pairs and a relation is
a set of tuples.
• The values in a relational tuple have to be simple—they cannot
contain any structure, such as a nested record or a list. This
limitation isn’t true for in-memory data structures, which can take
on much richer structures than relations.
• So if you want to use a richer in-memory data structure, you have
to translate it to a relational representation to store it on disk.
Hence the impedance mismatch—two different representations that
require translation.
• In the 1990s, many expected the impedance mismatch to lead to
relational databases being replaced with databases that replicate the
in-memory data structures to disk. That decade was marked by the
growth of object-oriented programming languages, and with them came
object-oriented databases, both looking to be the dominant
environment for software development in the new millennium. However,
while object-oriented languages succeeded in becoming the major force
in programming, object-oriented databases faded into obscurity.
• Impedance mismatch has been made much easier to deal with by the
wide availability of object relational mapping frameworks, such as
Hibernate and iBATIS that implement well-known mapping patterns, but
the mapping problem is still an issue.
• Relational databases continued to dominate the enterprise computing
world in the 2000s, but during that decade cracks began to open in
their dominance.
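The translation step behind the impedance mismatch can be illustrated with a small sketch. The order structure, the two flat "tables", and the helper names to_rows and from_rows are all hypothetical; the point is only that a nested in-memory object has to be flattened into simple tuples and reassembled later.

```python
# In-memory order: a nested structure with a list of line items.
order = {
    "id": 99,
    "customer": "Ann",
    "items": [
        {"product": "book", "qty": 2},
        {"product": "pen", "qty": 5},
    ],
}

def to_rows(order):
    """Flatten a nested order into flat tuples for two hypothetical tables."""
    order_row = (order["id"], order["customer"])
    item_rows = [(order["id"], i["product"], i["qty"]) for i in order["items"]]
    return order_row, item_rows

def from_rows(order_row, item_rows):
    """Rebuild the nested structure from the flat tuples."""
    order_id, customer = order_row
    return {
        "id": order_id,
        "customer": customer,
        "items": [
            {"product": product, "qty": qty}
            for (oid, product, qty) in item_rows
            if oid == order_id
        ],
    }

order_row, item_rows = to_rows(order)
rebuilt = from_rows(order_row, item_rows)
print(order_row, item_rows)
```

Object-relational mapping frameworks automate this kind of mapping, but the two representations still have to be kept in step.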
6. Application and Integration Databases
• In relational databases, the database often acts as an integration
database, where multiple applications developed by separate teams
store their data in a common database. This improves communication
because all the applications are operating on a consistent set of
persistent data.
There are downsides to shared database integration:
• A structure that’s designed to integrate many applications is more
complex than any single application needs.
• If an application wants to make changes to its data storage, it
needs to coordinate with all the other applications using the
database.
• Different applications have different structural and performance
needs, so an index required by one application may cause a
problematic hit on inserts for another.
A different approach is to treat your database as an application
database—which is only accessed by a single application codebase
that’s looked after by a single team.
Advantages:
• With an application database, only the team using the application
needs to know about the database structure, which makes it much
easier to maintain and evolve the schema.
• Since the application team controls both the database and the
application code, the responsibility for database integrity can be
put in the application code.

Introduction to NoSQL
NoSQL is a type of database management system (DBMS) that is
designed to handle and store large volumes of unstructured and semi-
structured data. Unlike traditional relational databases that use tables
with pre-defined schemas to store data, NoSQL databases use flexible
data models that can adapt to changes in data structures and are capable
of scaling horizontally to handle growing amounts of data.
The term NoSQL originally referred to “non-SQL” or “non-relational”
databases, but the term has since evolved to mean “not only SQL,” as
NoSQL databases have expanded to include a wide range of different
database architectures and data models.

NoSQL databases are generally classified into four main categories:
1. Document databases: These databases store data as
semi-structured documents, such as JSON or XML, and can be queried
using document-oriented query languages.
2. Key-value stores: These databases store data as key-value pairs,
and are optimized for simple and fast read/write operations.
3. Column-family stores: These databases store data as column
families, which are sets of columns that are treated as a single entity.
They are optimized for fast and efficient querying of large amounts of
data.
4. Graph databases: These databases store data as nodes and edges,
and are designed to handle complex relationships between data.
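Of the four categories above, the graph model is perhaps the least familiar, so here is a minimal sketch of the nodes-and-edges idea. The node names, the FOLLOWS relation, and the helper functions are assumptions for illustration; a real graph database would provide its own storage and query language.

```python
# Nodes keyed by id; edges stored as (source, relation, target) triples.
nodes = {
    "alice": {"type": "person"},
    "bob": {"type": "person"},
    "carol": {"type": "person"},
}
edges = [
    ("alice", "FOLLOWS", "bob"),
    ("bob", "FOLLOWS", "carol"),
]

def neighbors(node, relation):
    """Return the targets reachable from node over one edge of this relation."""
    return [t for (s, r, t) in edges if s == node and r == relation]

def two_hops(node, relation):
    """Return nodes exactly two edges away, excluding the start node."""
    result = []
    for first in neighbors(node, relation):
        for second in neighbors(first, relation):
            if second != node and second not in result:
                result.append(second)
    return result

print(two_hops("alice", "FOLLOWS"))
```

Following relationships is a direct traversal here, whereas in a relational schema the same question would typically need a self-join.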
NoSQL databases are often used in applications where there is a high
volume of data that needs to be processed and analyzed in real-time,
such as social media analytics, e-commerce, and gaming. They can also
be used for other applications, such as content management systems,
document management, and customer relationship management.
However, NoSQL databases may not be suitable for all applications, as
they may not provide the same level of data consistency and
transactional guarantees as traditional relational databases. It is
important to carefully evaluate the specific needs of an application when
choosing a database management system.
NoSQL, originally referring to “non-SQL” or “non-relational”, is a kind
of database that provides a mechanism for storage and retrieval of data
modeled by means other than the tabular relations used in relational
databases. Such databases have existed since the late 1960s, but did
not obtain the NoSQL moniker until a surge of popularity in the early
twenty-first century. NoSQL databases are used in real-time web
applications and big data, and their use is increasing over time.
• NoSQL systems are also sometimes called “Not only SQL” to
emphasize the fact that they may support SQL-like query languages.
Motivations for using a NoSQL database include simplicity of design,
simpler horizontal scaling to clusters of machines, and finer control
over availability. The data structures used by NoSQL databases are
different from those used by default in relational databases, which
makes some operations faster in NoSQL. The suitability of a given
NoSQL database depends on the problem it should solve.
• NoSQL databases, also known as “not only SQL” databases, are a
type of database management system that has gained
popularity in recent years. Unlike traditional relational databases,
NoSQL databases are designed to handle large amounts of
unstructured or semi-structured data, and they can accommodate
dynamic changes to the data model. This makes NoSQL databases a
good fit for modern web applications, real-time analytics, and big data
processing.
 Data structures used by NoSQL databases are sometimes also
viewed as more flexible than relational database tables. Many NoSQL
stores compromise consistency in favor of availability, speed and
partition tolerance. Barriers to the greater adoption of NoSQL stores
include the use of low-level query languages, lack of standardized
interfaces, and huge previous investments in existing relational
databases.
• Most NoSQL stores lack true ACID (Atomicity, Consistency, Isolation,
Durability) transactions, but a few databases, such as MarkLogic,
Aerospike, FairCom c-treeACE, Google Spanner (though technically a
NewSQL database), Symas LMDB, and OrientDB, have made them
central to their designs.
• Most NoSQL databases offer a concept of eventual consistency, in
which database changes are propagated to all nodes eventually, so
queries for data might not return updated data immediately or might
return data that is out of date, a problem known as stale
reads. Some NoSQL systems may also exhibit lost writes and other
forms of data loss. Some NoSQL systems provide mechanisms such as
write-ahead logging to avoid data loss.
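A stale read under eventual consistency can be simulated with a toy model. This is a sketch under simplifying assumptions: the Cluster and Replica classes are hypothetical, writes always land on the first replica, and propagation is triggered by hand rather than by a real replication protocol.

```python
class Replica:
    """One node holding its own copy of the data."""
    def __init__(self):
        self.data = {}

class Cluster:
    """Toy cluster: writes land on one replica and are propagated later."""
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.pending = []  # updates not yet sent to the other replicas

    def write(self, key, value):
        self.replicas[0].data[key] = value
        self.pending.append((key, value))

    def read(self, key, replica_index):
        return self.replicas[replica_index].data.get(key)

    def propagate(self):
        """Apply all pending updates to the remaining replicas."""
        for key, value in self.pending:
            for replica in self.replicas[1:]:
                replica.data[key] = value
        self.pending.clear()

cluster = Cluster(3)
cluster.write("stock", 10)
stale = cluster.read("stock", 2)   # replica 2 has not seen the write yet
cluster.propagate()
fresh = cluster.read("stock", 2)   # after propagation the replicas agree
print(stale, fresh)
```

The window between write and propagate is where a client reading from a different replica sees old data.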
• One simple example of a NoSQL database is a document database.
In a document database, data is stored in documents rather than
tables. Each document can contain a different set of fields, making it
easy to accommodate changing data requirements.
• Take, for instance, a database that holds data regarding
employees. In a relational database, this information might
be stored in tables, with one table for employee information and
another table for department information. In a document database,
each employee would be stored as a separate document, with all of
their information contained within the document.
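The employee example can be sketched as follows. The sample data, the field names, and the to_document helper are made up for illustration; the sketch only shows how two flat tables collapse into self-contained documents whose fields may differ.

```python
import json

# Relational-style: two flat tables linked by a department id (sample data).
employees = [(1, "Asha", 10), (2, "Ravi", 20)]
departments = {10: "Engineering", 20: "Sales"}

def to_document(employee):
    """Embed the department inside the employee document."""
    emp_id, name, dept_id = employee
    return {
        "_id": emp_id,
        "name": name,
        "department": {"id": dept_id, "name": departments[dept_id]},
    }

docs = [to_document(e) for e in employees]

# Documents need not share the same fields: only the first gets "skills".
docs[0]["skills"] = ["python", "sql"]

print(json.dumps(docs[0], indent=2))
```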
Key Features of NoSQL :
1. Dynamic schema: NoSQL databases do not have a fixed schema
and can accommodate changing data structures without the need for
migrations or schema alterations.
2. Horizontal scalability: NoSQL databases are designed to scale out
by adding more nodes to a database cluster, making them well-suited
for handling large amounts of data and high levels of traffic.
3. Document-based: Some NoSQL databases, such as MongoDB, use
a document-based data model, where data is stored in semi-
structured format, such as JSON or BSON.
4. Key-value-based: Other NoSQL databases, such as Redis, use a
key-value data model, where data is stored as a collection of key-
value pairs.
5. Column-based: Some NoSQL databases, such as Cassandra, use a
column-based data model, where data is organized into columns
instead of rows.
6. Distributed and high availability: NoSQL databases are often
designed to be highly available and to automatically handle node
failures and data replication across multiple nodes in a database
cluster.
7. Flexibility: NoSQL databases allow developers to store and retrieve
data in a flexible and dynamic manner, with support for multiple data
types and changing data structures.
8. Performance: NoSQL databases are optimized for high performance
and can handle a high volume of reads and writes, making them
suitable for big data and real-time applications.
Advantages of NoSQL: There are many advantages of working with
NoSQL databases such as MongoDB and Cassandra. The main
advantages are high scalability and high availability.
1. High scalability: NoSQL databases use sharding for horizontal
scaling. Sharding means partitioning the data and placing the
partitions on multiple machines in such a way that the order of the
data is preserved. Vertical scaling means adding more resources to the
existing machine, whereas horizontal scaling means adding more
machines to handle the data. Vertical scaling is harder to sustain
because a single machine can only grow so far, whereas NoSQL databases
are designed to make horizontal scaling comparatively easy. Examples
of horizontally scaling databases are MongoDB, Cassandra, etc. NoSQL
can handle a huge amount of data because of this scalability: as the
data grows, the database scales out to handle it efficiently.
2. Flexibility: NoSQL databases are designed to handle unstructured or
semi-structured data, which means that they can accommodate
dynamic changes to the data model. This makes NoSQL databases a
good fit for applications that need to handle changing data
requirements.
3. High availability: The automatic replication feature in NoSQL
databases makes them highly available, because in case of a failure
the data can be restored from a replica to the last consistent state.
4. Scalability: NoSQL databases are highly scalable, which means that
they can handle large amounts of data and traffic with ease. This
makes them a good fit for applications that need to handle large
amounts of data or traffic
5. Performance: NoSQL databases are designed to handle large
amounts of data and traffic, which means that they can offer improved
performance compared to traditional relational databases.
6. Cost-effectiveness: NoSQL databases are often more cost-effective
than traditional relational databases, as they are typically less complex
and do not require expensive hardware or software.
7. Agility: Ideal for agile development.
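The sharding mentioned under high scalability can be sketched in a few lines. Note that this example uses hash-based sharding, which spreads keys evenly but does not preserve key order; range-based partitioning would be the variant that keeps order. The shard count, the shard_for function, and the in-memory dictionaries standing in for machines are all assumptions for illustration.

```python
import hashlib

NUM_SHARDS = 3
# Each dictionary stands in for one machine in the cluster.
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key, num_shards):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def put(key, value):
    shards[shard_for(key, NUM_SHARDS)][key] = value

def get(key):
    return shards[shard_for(key, NUM_SHARDS)].get(key)

for user in ["alice", "bob", "carol", "dave"]:
    put(user, {"name": user})

print(get("carol"))
print([len(s) for s in shards])
```

Because every client computes the same shard index for a given key, reads go straight to the machine that holds the data.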
Disadvantages of NoSQL: NoSQL has the following disadvantages.
1. Lack of standardization: There are many different types of NoSQL
databases, each with its own unique strengths and weaknesses. This
lack of standardization can make it difficult to choose the right
database for a specific application.
2. Lack of ACID compliance: Many NoSQL databases are not fully ACID-
compliant, which means that they may not guarantee the consistency,
integrity, and durability of data. This can be a drawback for
applications that require strong data consistency guarantees.
3. Narrow focus: NoSQL databases have a narrow focus: they are
mainly designed for storage and provide comparatively little other
functionality. Relational databases are a better choice than NoSQL in
the field of transaction management.
4. Open source: Many NoSQL databases are open source, and there is no
reliable standard for NoSQL yet. In other words, two database systems
are likely to differ from each other.
5. Lack of support for complex queries : NoSQL databases are not
designed to handle complex queries, which means that they are not a
good fit for applications that require complex data analysis or
reporting.
6. Lack of maturity : NoSQL databases are relatively new and lack the
maturity of traditional relational databases. This can make them less
reliable and less secure than traditional databases.
7. Management challenge: The purpose of big data tools is to make
the management of a large amount of data as simple as possible. But
it is not so easy. Data management in NoSQL is much more complex
than in a relational database. NoSQL, in particular, has a reputation
for being challenging to install and even more demanding to manage on
a daily basis.
8. Limited GUI tools: GUI tools for accessing the database are not
widely available in the market.
9. Backup: Backup is a weak point for some NoSQL databases such as
MongoDB, which has been criticized for lacking a straightforward
approach to backing up data in a consistent manner.
10. Large document size: Some database systems like MongoDB and
CouchDB store data in JSON format. This means that documents can be
quite large (a concern for big data, network bandwidth, and speed),
and descriptive key names add to the problem because they increase the
document size.
Types of NoSQL database: Types of NoSQL databases and the name
of the databases system that falls in that category are:
1. Graph Databases: Examples – Amazon Neptune, Neo4j
2. Key value store: Examples – Memcached, Redis, Coherence
3. Tabular: Examples – HBase, Bigtable, Accumulo
4. Document-based: Examples – MongoDB, CouchDB, Cloudant
When should NoSQL be used:
1. When a huge amount of data needs to be stored and retrieved.
2. The relationship between the data you store is not that important
3. The data changes over time and is not structured.
4. Support of Constraints and Joins is not required at the database level
5. The data is growing continuously and you need to scale the database
regularly to handle the data.
In conclusion, NoSQL databases offer several benefits over traditional
relational databases, such as scalability, flexibility, and cost-
effectiveness. However, they also have several drawbacks, such as a
lack of standardization, lack of ACID compliance, and lack of support for
complex queries. When choosing a database for a specific application, it
is important to weigh the benefits and drawbacks carefully to determine
the best fit.

Integration Databases in NoSQL

In NoSQL databases, integration databases refer to databases that
combine different types of NoSQL databases and/or traditional relational
databases to provide a comprehensive and flexible data storage solution.
This can help organizations to overcome some of the limitations of using
a single type of database and to take advantage of the strengths of
multiple database types.
Integration databases typically use a middleware layer to connect and
communicate between the different types of databases. The middleware
layer provides a uniform interface for applications to access data across
the different databases, which can simplify application development and
improve performance.
One of the benefits of integration databases is that they can help
organizations to use the best database type for each data storage
requirement. For example, some data may be best stored in a document
database, while other data may be best stored in a graph database. By
using an integration database, organizations can store all of their data in
a single place while taking advantage of the strengths of each database
type.
Another benefit of integration databases is that they can provide greater
scalability and reliability than using a single database. By using a
distributed database architecture, organizations can distribute their data
across multiple servers and data centers, which can improve
performance and provide greater resilience in the event of a hardware
failure or other issues.

Some popular integration databases in NoSQL include:

1. Apache Cassandra: A distributed database that is designed for
scalability and high availability and is built around a column-family
(wide-column) data model.
2. Apache Hadoop: A distributed data processing framework that
supports a variety of data sources, including HBase, Cassandra, and
MongoDB.
3. Apache Kafka: A distributed streaming platform that can be used to
integrate multiple data sources and to stream data to multiple
destinations.
Overall, integration databases in NoSQL can provide a powerful
solution for organizations that need to store and manage large
volumes of data across multiple data sources. By using a middleware
layer to connect and communicate between different types of
databases, organizations can take advantage of the strengths of each
database type and provide a flexible and scalable solution for their
data storage needs.

Nowadays, an enormous amount of information is being generated each
second. This information comes in different forms: unstructured,
structured, and semi-structured data. The variety and volume of this
information can’t be managed by traditional databases. Therefore,
NoSQL systems have emerged as a new generation of database systems.
NoSQL databases are more proficient at handling heterogeneous data.
NoSQL data integration requires tools that can scale to accommodate a
large volume of information, whereas conventional SQL ETL tools
require complicated manual coding and can also disrupt production
sources.
A database serving as a store for numerous applications is called an
integration database; data is thereby integrated across applications.
An integration database needs a schema that takes all client
applications into account. The resulting schema is either more
general, more complicated, or both.
Here is an example for a better understanding of database
integration. Suppose an organization stores its computation data in an
Oracle database and its client information in Salesforce. With the
help of database integration processes, employees can get the
integrated data of the two systems in a single place. Some
organizations use website database integration to manage and bring
together information from different site pages. Database integration
is only viable when data from on-premise systems, legacy systems, and
cloud databases is consolidated, since each company uses different
software.

Aggregate Data Model in NoSQL

As we know, NoSQL databases store data in formats other than the
tables of relational databases. NoSQL is used in nearly every industry
nowadays. For people who interact with data in databases, the
aggregate data model helps with that interaction.
Features of NoSQL Databases:
• Schema agnostic: NoSQL databases do not require a specific
schema or storage structure, unlike a traditional RDBMS.
• Scalability: NoSQL databases scale horizontally: as data grows
rapidly, commodity hardware can be added while preserving
scalability.
• Performance: To increase the performance of a NoSQL system, one
can add more commodity servers, giving reliable and fast access to the
database with minimum overhead.
• High availability: A traditional RDBMS relies on primary and
secondary nodes for fetching the data, whereas some NoSQL databases
use a masterless architecture.
• Global availability: As data is replicated among multiple servers and
clouds, the data is accessible from anywhere, which minimizes
latency.

Aggregate Data Models:

The term aggregate means a collection of objects that we treat as
a unit. An aggregate is a collection of data that we interact with as
a unit. These units of data or aggregates form the boundaries for ACID
operations.
Example of Aggregate Data Model:
The diagram has two aggregates:
• Customer and Order are the two aggregates, and the link between
them represents the relationship between aggregates.
• The diamond shows how data fits into the aggregate structure.
• Customer contains a list of billing addresses.
• Payment also contains a billing address.
• The address appears three times and is copied each time.
• This fits a domain where we don’t want shipping and billing
addresses to change.
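The two aggregates described above might look like this when written out as nested data. The field names and sample values are assumptions made for illustration, since the original diagram is not reproduced here; the sketch only shows the address being copied into each aggregate rather than shared.

```python
import copy

address = {"city": "Chicago"}

# Customer aggregate: contains its list of billing addresses.
customer = {
    "id": 1,
    "name": "Martin",
    "billing_addresses": [copy.deepcopy(address)],
}

# Order aggregate: refers to the customer by id and embeds everything else.
order = {
    "id": 99,
    "customer_id": 1,
    "items": [{"product_id": 27, "name": "notebook", "price": 32.45}],
    "shipping_address": copy.deepcopy(address),
    "payments": [
        {"card": "1000-1000", "billing_address": copy.deepcopy(address)}
    ],
}

# The customer later moves; the addresses recorded on the order stay as they were.
customer["billing_addresses"][0]["city"] = "Boston"

print(order["shipping_address"], order["payments"][0]["billing_address"])
```

Copying the address is deliberate here: an order should keep the address it was shipped and billed to, even if the customer record changes afterwards.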
Consequences of Aggregate Orientation:
• Aggregation is not a logical data property. It is all about how the
data is being used by applications.
• An aggregate structure may help with some data interactions but be
an obstacle for others.
• It has an important consequence for transactions.
• NoSQL databases generally don’t support ACID transactions that span
multiple aggregates, thus sacrificing some consistency.
• Aggregate-oriented databases support the atomic manipulation of a
single aggregate at a time.
Advantages:
• It can be used as a primary data source for online applications.
• Easy replication.
• No single point of failure.
• It provides fast performance and horizontal scalability.
• It can handle structured, semi-structured, and unstructured data with
equal effort.
Disadvantages:
• No standard rules.
• Limited query capabilities.
• Doesn’t work well with relational data.
• Not so popular in the enterprise.
• When the volume of data increases, it is difficult to maintain unique
values.

Key-Value Data Model in NoSQL

A key-value data model or database is also referred to as a key-value
store. It is a non-relational type of database. It uses an associative
array as its basic data structure, in which an individual key is linked
with just one value in a collection. Keys are unique identifiers for
the values, and a value can be any kind of entity. A collection of
key-value pairs stored as separate records is called a key-value
database, and it does not have a predefined structure.

How do key-value databases work?

A key-value database associates a value, which may be anything from a
simple string to a complicated entity, with a key that is used to keep
track of the entity. As in many programming paradigms, a key-value
database resembles a map object, array, or dictionary, except that it
is stored persistently and controlled by a DBMS.
The key-value store uses an efficient and compact index structure to
be able to find a value rapidly and reliably using its key. For
example, Redis is a key-value store used to track lists, maps,
heaps, and primitive types (which are simple data structures) in a
persistent database. By supporting only a predetermined number of
value types, Redis can expose a very simple interface for querying and
manipulating them, and when configured appropriately it is capable of
high throughput.
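The dictionary-like behaviour described above can be sketched with a tiny in-memory store. The KeyValueStore class and its put, get, and delete methods are hypothetical names chosen for this example; unlike a real key-value database, this sketch keeps everything in memory and offers no persistence or replication.

```python
class KeyValueStore:
    """Minimal in-memory key-value store: each key maps to exactly one value."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store a value under a key, replacing any previous value."""
        self._data[key] = value

    def get(self, key, default=None):
        """Look up a value by key; return default if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove a key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False

store = KeyValueStore()
store.put("session:42", {"user": "asha", "cart": ["book", "pen"]})
store.put("counter", 7)  # values can be of different kinds

session = store.get("session:42")
removed = store.delete("counter")
missing = store.get("counter")
print(session, removed, missing)
```

Every operation goes through the key, which is why lookups are fast and also why the data cannot be searched by value.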

When to use a key-value database:

Here are a few situations in which you can use a key-value database:-
 User session attributes in an online app like finance or gaming, which
is referred to as real-time random data access.
 Caching mechanism for repeatedly accessing data or key-based
design.
 The application is developed on queries that are based on keys.

Features:

• One of the simplest kinds of NoSQL data models.
• For storing, getting, and removing data, key-value databases utilize
simple functions.
• Key-value databases have no query language.
• Built-in redundancy makes this database more reliable.

Advantages:

• It is very easy to use. Due to the simplicity of the database, values
can be of any kind, or even of different kinds when required.
• Its response time is fast due to its simplicity, provided that the
surrounding environment is well built and optimized.
• Key-value store databases are scalable vertically as well as
horizontally.
• Built-in redundancy makes this database more reliable.

Disadvantages:

• Because key-value databases have no query language, queries cannot be
ported from one database to another.
• The key-value store is not refined: you cannot query the database
without a key.
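The second limitation can be illustrated with a short sketch. The records
and the helper find_keys_by_branch are hypothetical, and a dict again
stands in for the store: a lookup by key is a single step, while a question
about the values forces the application to scan every record itself.

```python
# Hypothetical records held in a key-value store (a dict stands in for it).
records = {
    "user:1": {"name": "Tanmay", "branch": "Computer"},
    "user:2": {"name": "Abhishek", "branch": "Electronics"},
    "user:3": {"name": "Samriddha", "branch": "IT"},
}

# Lookup by key: the store finds the value directly.
by_key = records["user:2"]


def find_keys_by_branch(store, branch):
    # Lookup by value: the store cannot help, so every record is inspected.
    return [key for key, value in store.items() if value["branch"] == branch]


it_users = find_keys_by_branch(records, "IT")
print(by_key["name"], it_users)
```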

Some examples of key-value databases:

Here are some popular, widely used key-value databases:
• Couchbase: It permits SQL-style querying and text search.
• Amazon DynamoDB: One of the most widely used key-value databases,
trusted by a large number of users. It can handle a large number of
requests every day and provides various security options.
• Riak: A distributed key-value database used to develop applications.
• Aerospike: An open-source, real-time database that works with billions
of transactions.
• Berkeley DB: A high-performance, open-source database that provides
scalability.

Columnar Data Model of NoSQL

The columnar data model is an important NoSQL data model. NoSQL
databases differ from SQL databases because they use a data model whose
structure is different from the row-and-column table model used by
relational database management systems (RDBMS). NoSQL databases use a
flexible schema model that is designed to scale horizontally across many
servers and to handle large volumes of data.

Columnar Data Model of NoSQL :

A relational database stores data in rows and reads it row by row,
whereas a column store is organized as a set of columns. If someone wants
to run analytics on a small number of columns, those columns can be read
directly without filling memory with unwanted data. The values in a
column are usually of the same type and therefore compress more
efficiently, which makes reads faster.
Examples of the columnar data model: Cassandra and Apache HBase.
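One reason same-typed columns compress well is that they often contain long
runs of repeated values. The sketch below uses run-length encoding to show
the idea; it is an assumption chosen for illustration, not the scheme used
by Cassandra or HBase, and the column values are hypothetical.

```python
from itertools import groupby


def rle_encode(column):
    # Collapse each run of equal adjacent values into a (value, count) pair.
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    # Expand the (value, count) pairs back into the original column.
    return [value for value, count in pairs for _ in range(count)]


# A hypothetical "Course" column: six stored values, only two distinct runs.
course_column = ["B-Tech", "B-Tech", "B-Tech", "B-Tech", "M-Tech", "M-Tech"]
encoded = rle_encode(course_column)
print(encoded)
print(rle_decode(encoded) == course_column)
```

In a row-oriented layout the same values would be interleaved with names
and IDs, so the runs would be broken up and this kind of encoding would
save far less.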

Working of Columnar Data Model:

In the columnar data model, information is organized into columns
instead of rows. The columns still function much like tables in a
relational database, but the model is far more flexible because it is a
type of NoSQL database. The example below will help in understanding the
columnar data model:
Row-Oriented Table:

S.No.  Name       Course  Branch       ID
01.    Tanmay     B-Tech  Computer     2
02.    Abhishek   B-Tech  Electronics  5
03.    Samriddha  B-Tech  IT           7
04.    Aditi      B-Tech  E & TC       8

Column-Oriented Table:

S.No.  Name       ID
01.    Tanmay     2
02.    Abhishek   5
03.    Samriddha  7
04.    Aditi      8

S.No.  Course  ID
01.    B-Tech  2
02.    B-Tech  5
03.    B-Tech  7
04.    B-Tech  8

S.No.  Branch       ID
01.    Computer     2
02.    Electronics  5
03.    IT           7
04.    E & TC       8

The columnar data model uses the concept of a keyspace, which is similar
to a schema in the relational model.
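The difference between the two layouts in the student example can be
sketched in Python. This is a simplified in-memory picture, not a storage
engine: the field names and the helper to_columns are assumptions made for
the sketch. Rows are modelled as a list of records, and the column layout
as one list per field.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"s_no": 1, "name": "Tanmay", "course": "B-Tech", "branch": "Computer", "id": 2},
    {"s_no": 2, "name": "Abhishek", "course": "B-Tech", "branch": "Electronics", "id": 5},
    {"s_no": 3, "name": "Samriddha", "course": "B-Tech", "branch": "IT", "id": 7},
    {"s_no": 4, "name": "Aditi", "course": "B-Tech", "branch": "E & TC", "id": 8},
]


def to_columns(records):
    # Column-oriented layout: one list of values per field name.
    columns = {}
    for record in records:
        for field_name, value in record.items():
            columns.setdefault(field_name, []).append(value)
    return columns


columns = to_columns(rows)

# Reading one column touches only that column's values.
names = columns["name"]

# An aggregate over one column likewise ignores every other field.
id_total = sum(columns["id"])
print(names, id_total)
```

With the row layout, computing the same total would mean visiting every
full record even though only one field is needed.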
Advantages of Columnar Data Model :

• Well structured: Because these data models compress well, they are
well organized in terms of storage.
• Flexibility: Columns do not have to resemble one another, so new and
different columns can be added without disrupting the whole database.
• Aggregation queries are fast: Aggregation queries are quite fast
because the relevant information is stored together in a column. An
example is adding up the total number of students enrolled in one year.
• Scalability: The data can be spread across large clusters of machines,
even numbering in the thousands.
• Load times: A row table can be loaded in a few seconds, so load times
are excellent.

Disadvantages of Columnar Data Model:

• Designing an indexing schema: Designing an effective, working schema
is difficult and very time-consuming.
• Suboptimal data loading: Incremental data loading is suboptimal and
should be avoided, although this may not be an issue for some users.
• Security vulnerabilities: If security is a priority, note that the
columnar data model lacks built-in security features; in that case,
consider relational databases.
• Online Transaction Processing (OLTP): OLTP applications are not well
suited to columnar data models because of the way the data is stored.

Applications of Columnar Data Model:

• The columnar data model is widely used in blogging platforms.
• It is used in content management systems such as WordPress, Joomla,
etc.
• It is used in systems that maintain counters.
• It is used in systems that handle heavy write loads.
• It is used in services that have expiring usage.

Graph Based Data Model in NoSQL

The graph-based data model in NoSQL focuses on the relationships between
data elements. As the name suggests, each element is stored as a node,
and the associations between elements are known as links. Associations
are stored directly, because they are first-class elements of the data
model. These data models give us a conceptual view of the data.
They are based on a topological network structure. Graph theory uses
terms such as nodes, edges, and properties; here is what they mean in
the graph-based data model.
• Nodes: Instances of data that represent the objects to be tracked.
• Edges: Relationships between nodes.
• Properties: Information associated with nodes.
The image below shows nodes with properties, connected by relationships
that are represented by edges.
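The three terms can also be expressed as a small data structure. This is a
minimal sketch under stated assumptions: the Node and Edge classes, the
labels, and the sample data are hypothetical and do not reflect the storage
format of any particular graph database.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str
    # Properties: information associated with the node.
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    source: str        # id of the node where the relationship starts
    target: str        # id of the node where the relationship ends
    relationship: str  # the kind of relationship, stored as data


nodes = {
    "p1": Node("p1", "Person", {"name": "Tanmay"}),
    "p2": Node("p2", "Person", {"name": "Aditi"}),
    "c1": Node("c1", "Course", {"title": "B-Tech"}),
}
edges = [
    Edge("p1", "p2", "KNOWS"),
    Edge("p1", "c1", "ENROLLED_IN"),
    Edge("p2", "c1", "ENROLLED_IN"),
]

# Follow the ENROLLED_IN edges that point at the course node.
enrolled = [
    nodes[edge.source].properties["name"]
    for edge in edges
    if edge.relationship == "ENROLLED_IN" and edge.target == "c1"
]
print(enrolled)
```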

Working of Graph Data Model :

In these data models, connected nodes are linked physically, and the
physical connection between them is itself treated as a piece of data.
Connecting data in this way makes relationships easy to query: the data
model reads a relationship directly from storage instead of computing it
through a series of lookup steps. Like many other NoSQL databases, these
data models have no fixed schema, which keeps the model flexible and
easy to edit.
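Reading relationships directly can be sketched with an adjacency list, in
which each node already holds references to its neighbours. The sample
graph and the helper within_hops are hypothetical; the sketch uses a plain
breadth-first traversal, not the internals of a graph database.

```python
from collections import deque

# Each node stores its outgoing relationships directly.
adjacency = {
    "Tanmay": ["Abhishek", "Samriddha"],
    "Abhishek": ["Aditi"],
    "Samriddha": ["Aditi"],
    "Aditi": [],
}


def within_hops(graph, start, max_hops):
    # Breadth-first traversal that stops expanding after `max_hops` steps.
    seen = {start}
    frontier = deque([(start, 0)])
    reached = []
    while frontier:
        node, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not follow edges beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                reached.append(neighbour)
                frontier.append((neighbour, hops + 1))
    return reached


direct = within_hops(adjacency, "Tanmay", 1)
two_hops = within_hops(adjacency, "Tanmay", 2)
print(direct, two_hops)
```

Each step only follows the links stored with the current node, so the cost
depends on the part of the graph that is visited rather than on the total
amount of data.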
Examples of Graph Data Models :
• JanusGraph: An open-source, scalable graph database system that is
very helpful in big data analytics. JanusGraph has features such as:
   • Storage: Many options are available for storing graph data, such as
   Cassandra.
   • Support for transactions: It offers transaction support such as
   ACID (Atomicity, Consistency, Isolation, and Durability) and can
   handle thousands of concurrent users.
   • Searching options: Complex searching is available as an optional
   feature.
• Neo4j: The name is often expanded as Network Exploration and
Optimization 4 Java. This graph database is written in Java and has
native graph storage and processing. Neo4j has features such as:
   • Scalable: It scales by partitioning data into pieces known as
   shards.
   • Higher availability: Availability is very high thanks to continuous
   backups and rolling upgrades.
   • Query language: It uses Cypher, a programmer-friendly graph query
   language.
• DGraph: An open-source distributed graph database system designed for
scalability. DGraph's main features are:
   • Query language: It uses GraphQL, which was designed for APIs.
   • Open-source system: It supports many open standards.

Advantages of Graph Data Model :

• Structure: The structures are very agile and workable.
• Explicit representation: The relationships between entities are
represented explicitly.
• Real-time output results: Queries return results in real time.

Disadvantages of Graph Data Model :

• No standard query language: The query language depends on the platform
in use, so there is no single standard query language.
• Transactional systems: Graphs are poorly suited to transaction-based
systems.
• Small user base: The user base is small, which makes it difficult to
get support when running into a problem.

Applications of Graph Data Model:

• Graph data models are widely used in fraud detection.
• They are used in digital asset management, where they provide a
scalable database model for keeping track of digital assets.
• They are used in network management, for example to alert a network
administrator about problems in a network.
• They are used in context-aware services, for example to provide
traffic updates.
• They are used in real-time recommendation engines, which provide a
better user experience.

