Unit 4-1
Introduction to NoSQL
NoSQL is a type of database management system (DBMS) that is designed to
handle and store large volumes of unstructured and semi-structured data.
Unlike traditional relational databases that use tables with
pre-defined schemas to store data, NoSQL databases use flexible data
models that can adapt to changes in data structures and are capable of
scaling horizontally to handle growing amounts of data.
The term NoSQL
originally referred to “non-SQL” or “non-relational” databases, but the
term has since evolved to mean “not only SQL,” as NoSQL databases have
expanded to include a wide range of different database architectures and
data models.
2. Key-value stores: These databases store data as key-value pairs, and are
optimized for simple and fast read/write operations.
4. Graph databases: These databases store data as nodes and edges, and are
designed to handle complex relationships between data.
NoSQL databases are often used in applications where there is a high volume of
data that
needs to be processed and analyzed in real-time, such as social media
analytics, e-commerce, and gaming. They can also be used for other
applications, such as content management systems, document management,
and customer relationship management.
However, NoSQL databases may not be suitable for all applications, as they may
not provide the
same level of data consistency and transactional guarantees as
traditional relational databases. It is important to carefully evaluate
the specific needs of an application when choosing a database management
system.
NoSQL systems are also sometimes called “Not only SQL” to emphasize the fact
that they may support SQL-like query languages. The design goals of a NoSQL
database include simplicity of design, simpler horizontal scaling to
clusters of machines, and finer control over availability. The data
structures used by NoSQL databases are different from those used by default
in relational databases, which makes some operations faster in NoSQL. The
suitability of a given NoSQL database depends on the problem it should solve.
NoSQL databases, also known as “not only SQL” databases, are a newer type of
database management system that has gained popularity in recent years.
Unlike traditional relational
databases, NoSQL databases are designed to handle large amounts of
unstructured or semi-structured data, and they can accommodate dynamic
changes to the data model. This makes NoSQL databases a good fit for
modern web applications, real-time analytics, and big data processing.
Many NoSQL stores compromise consistency in favor of availability,
speed, and partition tolerance.
Barriers to the greater adoption of NoSQL stores include the use of
low-level query languages, lack of standardized interfaces, and huge
previous investments in existing relational databases.
model. This makes NoSQL databases a good fit for modern web
applications, real-time analytics, and big data processing.
2. Horizontal scalability:
NoSQL databases are designed to scale out by adding more nodes to a
database cluster, making them well-suited for handling large amounts of
data and high levels of traffic.
8. Performance: NoSQL databases are optimized for high performance and can
handle a high volume of
reads and writes, making them suitable for big data and real-time
applications.
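The horizontal scaling described above can be illustrated with a minimal sketch of hash-based partitioning, in which each key is assigned to one node of a cluster. The node names, the key format, and the helper functions `node_for_key` and `distribute` are hypothetical and chosen only for illustration; this is not how any particular NoSQL product places data.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick a node for a key by hashing the key (illustrative scheme only)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def distribute(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Group keys by the node that would store them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


# Hypothetical workload: 1000 user keys spread over a growing cluster.
keys = [f"user:{i}" for i in range(1000)]
three_nodes = distribute(keys, ["node-a", "node-b", "node-c"])
four_nodes = distribute(keys, ["node-a", "node-b", "node-c", "node-d"])

# Count how many keys land on a different node after the cluster grows.
moved = sum(
    1
    for key in keys
    if node_for_key(key, ["node-a", "node-b", "node-c"])
    != node_for_key(key, ["node-a", "node-b", "node-c", "node-d"])
)
```

Adding a fourth node lowers the share of data each node holds, which is the point of scaling out. With this simple modulo scheme many keys change nodes when the cluster grows; systems such as Cassandra use consistent hashing to limit that movement.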
Advantages of NoSQL:
There are many advantages of working with NoSQL databases such as
MongoDB and Cassandra. The main advantages are high scalability and high
availability.
4. Scalability: NoSQL databases are highly scalable, which means that they can
handle large amounts of
data and traffic with ease. This makes them a good fit for applications
that need to handle large amounts of data or traffic.
6. Cost-effectiveness: NoSQL
databases are often more cost-effective than traditional relational
databases, as they are typically less complex and do not require
expensive hardware or software.
5. Lack of support for complex queries: NoSQL databases are not designed to
handle complex queries, which means that
they are not a good fit for applications that require complex data
analysis or reporting.
particular, has a reputation for being challenging to install and even
more hectic to manage on a daily basis.
8. GUI is not available: Flexible GUI tools for accessing the database are not
widely available in the market.
Types of NoSQL database: The types of NoSQL databases and the names of the
database systems that fall in each category are:
2. The relationship between the data you store is not that important
5. The data is growing continuously and you need to scale the database regularly
to handle the data.
In conclusion, NoSQL
databases offer several benefits over traditional relational databases,
such as scalability, flexibility, and cost-effectiveness. However, they
also have several drawbacks, such as a lack of standardization, lack of
ACID compliance, and lack of support for complex queries. When choosing a
database for a specific application, it is important to weigh the
benefits and drawbacks carefully to determine the best fit.
1. Scalability:
2. Performance:
High Throughput and Low Latency: NoSQL databases are optimized for
fast read and write operations, making them suitable for real-time
applications and high-speed data processing tasks.
Handling Variety: NoSQL databases excel at handling diverse data types,
including unstructured and semi-structured data such as documents,
JSON, XML, and multimedia files.
Support for Big Data Workloads: NoSQL databases are well-suited for big
data analytics and processing tasks, enabling organizations to analyze
large volumes of data in real-time and derive valuable insights.
6. Cost Efficiency:
Real-Time Analytics: NoSQL databases are used for real-time analytics
applications such as fraud detection, recommendation engines, and
personalized content delivery.
IoT and Sensor Data: NoSQL databases are ideal for handling large
volumes of time-series data generated by IoT devices, sensors, and
machine logs.
Overall, the adoption of NoSQL databases is driven by the need for scalable,
flexible, and high-performance data management solutions that can meet the
demands of modern businesses in the digital age. By embracing NoSQL
technology, organizations can gain a competitive edge, unlock new opportunities,
and deliver innovative products and services to their customers.
This model is one of the most basic models of NoSQL databases. As the name
suggests, the data is stored in the form of key-value pairs. The key is
usually a string, an integer, or a sequence of characters, but it can also
be a more advanced data type. The value is the data associated with the
key. Key-value databases generally store data as a hash table in which each
key is unique. The value can be of any type (JSON, BLOB (Binary Large
Object), strings, etc.). This pattern is commonly used in shopping websites
and e-commerce applications.
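The key-value model described above can be sketched with an in-memory hash table. The class `KeyValueStore`, its method names, and the sample keys (`cart:42`, `session:abc`) are hypothetical and only illustrate the idea; real key-value stores add persistence, replication, and networking on top of this.

```python
class KeyValueStore:
    """A toy key-value store backed by a Python dict (a hash table)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Each key is unique: writing an existing key overwrites its value.
        self._data[key] = value

    def get(self, key, default=None):
        # Lookup by key is the only access path; values are opaque to the store.
        return self._data.get(key, default)

    def delete(self, key):
        # Return True if the key existed and was removed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
# Values can be of any type: a JSON-like dict, raw bytes (a BLOB), a string.
store.put("cart:42", {"items": ["book", "pen"], "total": 17.5})
store.put("session:abc", b"\x00\x01")
store.put("cart:42", {"items": ["book"], "total": 12.0})  # overwrite
```

Because the store only understands keys, a question such as "which carts contain a pen?" has to read every value, which is why complex queries are a weak point of this model.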
Advantages:
Limitations:
Complex queries may involve multiple key-value pairs, which can slow
performance.
Examples:
DynamoDB
Berkeley DB
Rather than storing data in relational tuples, this model stores data in
individual cells that are further grouped into columns. Column-oriented
databases operate on columns: they store large amounts of data together by
column. The format and names of the columns can vary from one row to
another. Every column is treated separately, but each individual column may
still contain multiple other columns, as in traditional databases.
In short, the column is the unit of storage in this model.
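A minimal sketch of the idea above: the same records are regrouped so that each column is stored separately, and rows are free to omit columns. The sample records and the helper `to_columns` are hypothetical; real column stores such as HBase or Cassandra use far more elaborate on-disk layouts.

```python
# Row-oriented view: each record is kept together. Rows need not share columns.
rows = [
    {"id": 1, "name": "Asha", "city": "Pune", "age": 31},
    {"id": 2, "name": "Ravi", "city": "Delhi"},  # no "age" column
    {"id": 3, "name": "Meera", "age": 27},       # no "city" column
]


def to_columns(rows):
    """Regroup rows so that every column holds its own {row id: value} map."""
    columns = {}
    for row in rows:
        row_id = row["id"]
        for name, value in row.items():
            if name == "id":
                continue
            columns.setdefault(name, {})[row_id] = value
    return columns


columns = to_columns(rows)

# An aggregate over one column touches only that column's data.
ages = columns["age"]
average_age = sum(ages.values()) / len(ages)
```

Reading a single column for many rows is cheap in this layout, while reassembling one full row requires visiting every column.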
Advantages:
Examples:
HBase
Bigtable by Google
Cassandra
3. Document Database:
The document database
stores and retrieves data in the form of key-value pairs, but here the
values are called documents. A document is a complex data structure: it
can take the form of text, arrays, strings, JSON, XML, or any similar
format. The use of nested documents is also very common. This model is
effective because much of the data created today is in JSON form and is
unstructured.
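The document model above can be sketched with plain Python dictionaries: each document is a nested, JSON-like structure, and documents in the same collection need not share fields. The sample documents and the helpers `get_path` and `find` are hypothetical and are not the API of any document database.

```python
import json

# A "collection" of documents with differing fields and a nested sub-document.
collection = [
    {
        "_id": 1,
        "title": "Intro to NoSQL",
        "tags": ["database", "nosql"],
        "author": {"name": "A. Writer", "city": "Pune"},
    },
    {
        "_id": 2,
        "title": "Graph basics",
        "tags": ["graph"],
        "author": {"name": "B. Writer"},  # no "city" field
        "pages": 12,                      # field absent from document 1
    },
]


def get_path(document, dotted_path):
    """Follow a dotted path such as 'author.city'; return None if absent."""
    current = document
    for part in dotted_path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def find(collection, dotted_path, expected):
    """Return the documents whose value at dotted_path equals expected."""
    return [doc for doc in collection if get_path(doc, dotted_path) == expected]


matches = find(collection, "author.city", "Pune")
as_json = json.dumps(collection[0])  # documents serialize naturally to JSON
```

Unlike a pure key-value store, the database can look inside the value here, which is what makes queries on nested fields possible.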
Advantages:
This type of format is very useful and apt for semi-structured data.
Limitations:
Examples:
MongoDB
CouchDB
4. Graph Databases:
This architecture
pattern deals with the storage and management of data in graphs. Graphs
are structures that depict connections between two or more objects in
some data. The objects or entities are called nodes and are joined
together by relationships called edges. Each edge has a unique
identifier. Each node serves as a point of contact for the graph.
This pattern is very commonly used in social networks, where there are a
large number of entities and each entity has one or many
characteristics that are connected by edges. In the relational database
pattern, tables are loosely connected, whereas in graphs the connections
are strong and rigid in nature.
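The node-and-edge structure described above can be sketched as an adjacency list with a breadth-first traversal. The people in `edges` and the function `reachable` are hypothetical examples. The `visited` set is what keeps the traversal from looping forever when the connections form a cycle.

```python
from collections import deque

# Each node maps to the nodes it has an outgoing edge to ("follows").
edges = {
    "alice": ["bob", "carol"],
    "bob": ["alice", "dave"],  # bob -> alice creates a cycle
    "carol": ["dave"],
    "dave": ["alice"],
    "erin": [],                # not connected to anyone
}


def reachable(graph, start):
    """Return the nodes reachable from start, in breadth-first order."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph.get(node, []):
            if neighbor not in visited:  # skip nodes already seen (cycle guard)
                visited.add(neighbor)
                queue.append(neighbor)
    return order


network = reachable(edges, "alice")
```

Questions such as "who can alice reach through her connections?" are answered by walking edges directly, without the repeated joins a relational schema would need.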
Advantages:
Limitations:
Wrong connections may lead to infinite loops.
Examples:
Neo4J
Figure – Graph model format of NoSQL Databases
Description: Wide column stores store data in columns rather than rows,
enabling efficient retrieval of columns across multiple rows.
6. Time-Series Databases:
Use Case: Ideal for storing and analyzing time-series data generated by
sensors, devices, and applications.
7. Multimodel Databases:
Use Case: Useful for applications requiring a combination of key-value,
document, graph, and relational data models.
8. NewSQL Databases:
Each of these NoSQL architectural patterns offers distinct advantages and
trade-offs, allowing organizations to choose the most suitable approach based on their
specific requirements, such as scalability, performance, flexibility, and
consistency. By leveraging these patterns, organizations can effectively manage
and derive insights from big data while optimizing resource utilization and
ensuring high availability and fault tolerance.
MongoDB: An introduction
MongoDB, one of the most popular NoSQL databases, is an open-source
document-oriented database. The term ‘NoSQL’ means ‘non-relational’. It
means that MongoDB isn’t based on the table-like relational database
structure but provides an altogether different mechanism for storage and
retrieval of data. This format of storage is called BSON (similar to the
JSON format).
{
title: 'Geeksforgeeks',
by: 'Harshit Gupta',
url: 'https://www.geeksforgeeks.org',
type: 'NoSQL'
}
RDBMS vs MongoDB:
RDBMS has a typical schema design that shows the number of tables and
the relationships between these tables, whereas MongoDB is
document-oriented and does not enforce a fixed schema or relationships.
There are a few terms that correspond in both databases. What’s
called a Table in RDBMS is called a Collection in MongoDB. Similarly, a
Row is called a Document and a Column is called a Field. MongoDB
provides a default ‘_id’ (if not provided explicitly), which is a 12-byte
value, usually displayed as a 24-character hexadecimal string, that
ensures the uniqueness of every document. It is
similar to the Primary key in RDBMS.
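To make the 12-byte ‘_id’ concrete, here is a minimal sketch that builds an identifier with the same layout as a MongoDB ObjectId: a 4-byte timestamp, a 5-byte random value, and a 3-byte counter. The function `new_object_id` is hypothetical and written only for illustration; it is not the code used by MongoDB or its drivers.

```python
import itertools
import os
import struct
import time

# Per-process random value and a counter with a random starting point.
_process_random = os.urandom(5)
_counter = itertools.count(int.from_bytes(os.urandom(3), "big"))


def new_object_id() -> str:
    """Build a 12-byte identifier and return it as 24 hex characters."""
    timestamp = struct.pack(">I", int(time.time()))            # 4 bytes
    counter = (next(_counter) % 0x1000000).to_bytes(3, "big")  # 3 bytes
    raw = timestamp + _process_random + counter                 # 12 bytes total
    return raw.hex()


first_id = new_object_id()
second_id = new_object_id()
```

Because the counter advances on every call, two identifiers generated in the same second by the same process still differ, and the leading timestamp means identifiers sort roughly by creation time.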
Features of MongoDB:
Document Oriented: MongoDB stores the main subject
in a minimal number of documents rather than breaking it up into
multiple relational structures as an RDBMS does. For example, it stores all
the information of a computer in a single document called Computer and
not in distinct relational structures like CPU, RAM, Hard disk, etc.
built-in solution for partitioning and sharding your database.
Advantages of MongoDB
MongoDB offers several potential benefits:
data across a cluster of machines. MongoDB also supports the creation of
zones of data based on a shard key.
Aggregation. The DBMS also has built-in aggregation capabilities, which let
users run MapReduce code directly on the database rather than running
MapReduce on Hadoop. MongoDB also includes its own file system called
GridFS, akin to the Hadoop Distributed File System. The file system is used
primarily for storing files larger than BSON's size limit of 16 MB per
document. These similarities let MongoDB be used instead of Hadoop, though
the database software does integrate with Hadoop, Spark and other data
processing frameworks.
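The MapReduce style of aggregation mentioned above can be sketched in plain Python: a map step emits key-value pairs from each document and a reduce step combines the values per key. The `orders` data and the functions `map_phase` and `reduce_phase` are hypothetical and only illustrate the pattern; they are not MongoDB's API.

```python
from collections import defaultdict

# Hypothetical order documents.
orders = [
    {"customer": "A", "amount": 250, "status": "paid"},
    {"customer": "B", "amount": 100, "status": "paid"},
    {"customer": "A", "amount": 50, "status": "cancelled"},
    {"customer": "A", "amount": 75, "status": "paid"},
]


def map_phase(docs):
    """Emit (customer, amount) for every paid order."""
    for doc in docs:
        if doc["status"] == "paid":
            yield doc["customer"], doc["amount"]


def reduce_phase(pairs):
    """Sum the emitted amounts for each customer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


totals = reduce_phase(map_phase(orders))
```

Running the computation where the data lives avoids exporting every document to a separate processing system just to compute a per-customer total.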
Disadvantages of MongoDB
Though there are some valuable benefits to MongoDB, there are some downsides
to it as well.
Continuity. With its automatic failover strategy, a user sets up just one master
node in a MongoDB cluster. If the master fails, another node automatically
takes over as the new master. This switch promises continuity, but it isn't
instantaneous; it can take up to a minute. By comparison, the Cassandra
NoSQL database supports multiple master nodes. If one master goes down,
another is standing by, creating a highly available database infrastructure.
Security. User authentication isn't enabled by default in MongoDB
databases. Malicious hackers have targeted large numbers of unsecured
MongoDB systems in attacks, which led to the addition of a default
setting that blocks networked connections to
databases if they haven't been configured by a database administrator.