NoSQL Databases and Big Data Storage Systems

NoSQL databases are designed for scalability, availability, and flexibility, making them suitable for unstructured and distributed data, particularly in Big Data applications. The CAP Theorem highlights that these systems can only guarantee two of three properties: consistency, availability, and partition tolerance. Various types of NoSQL databases, including document stores like MongoDB, key-value stores like Redis, wide column stores like Cassandra, and graph databases like Neo4j, each have unique structures and use cases.

Uploaded by

aaliyakaunain98

1. Introduction to NoSQL Systems

What is NoSQL?

NoSQL (Not Only SQL) is a category of database management systems that store and retrieve data
modeled in ways other than the tabular relations used in relational databases.

Why NoSQL?

• Designed for scalability, availability, and flexibility.
• Better suited for unstructured, semi-structured, and distributed data.
• Ideal for Big Data and real-time web applications.

Characteristics:

• Schema-less or flexible schema.
• High availability and partition tolerance.
• Support for horizontal scaling.
• Optimized for specific access patterns.

2. The CAP Theorem

Definition

The CAP theorem, proposed by Eric Brewer, states that a distributed database system can guarantee
at most two of the following three properties at the same time:

1. Consistency (C) – Every read receives the most recent write or an error.

2. Availability (A) – Every request receives a (non-error) response, with no guarantee that it
reflects the most recent write.

3. Partition Tolerance (P) – The system continues to operate even when network failures drop or
delay messages between nodes.

Because network partitions cannot be ruled out in a distributed system, the practical choice
during a partition is between consistency and availability.

Implication for NoSQL:

• Many NoSQL databases relax strong consistency to achieve availability and partition tolerance,
settling for weaker guarantees such as eventual consistency.
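
The trade-off can be illustrated with a toy model. The sketch below is not taken from any real database: the class names `Replica` and `TinyCluster`, the two-node layout, the shared counter used as a timestamp, and the last-write-wins repair step are all assumptions made for illustration.

```python
class Replica:
    """One node holding a single value and the version of its last write."""

    def __init__(self):
        self.value = None
        self.version = 0


class TinyCluster:
    """Toy two-replica store. mode is "CP" or "AP" (hypothetical model)."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.replicas = [Replica(), Replica()]
        self.partitioned = False
        self._clock = 0  # shared counter standing in for write timestamps

    def _check_available(self):
        # A CP system refuses to answer when it cannot reach its peer,
        # because it cannot prove that its copy is the most recent one.
        if self.partitioned and self.mode == "CP":
            raise RuntimeError("unavailable during partition")

    def write(self, node, value):
        self._check_available()
        self._clock += 1
        # During a partition an AP system updates only the reachable replica.
        targets = [self.replicas[node]] if self.partitioned else self.replicas
        for replica in targets:
            replica.value, replica.version = value, self._clock

    def read(self, node):
        self._check_available()
        return self.replicas[node].value

    def heal(self):
        """End the partition and converge on the newest write (last write wins)."""
        self.partitioned = False
        newest = max(self.replicas, key=lambda r: r.version)
        for replica in self.replicas:
            replica.value, replica.version = newest.value, newest.version
```

In "AP" mode, a read on the isolated replica keeps returning the older value until `heal()` runs, which is one simple form of eventual consistency. In "CP" mode the same read raises an error instead of risking a stale answer.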

3. Document-Based NoSQL Systems and MongoDB

Document Stores

• Store data in the form of documents, typically JSON, BSON, or XML.
• Each document is self-describing, containing fields and values of various types.

MongoDB

• A popular source-available, document-oriented NoSQL database (licensed under the SSPL since 2018; earlier versions were open source under the AGPL).
• Stores documents in BSON (Binary JSON) format.
• Supports powerful querying, indexing, and aggregation features.

Key Features:

• Schema flexibility.
• Sharding for horizontal scaling.
• Replication for high availability.
• Supports secondary indexes, text search, and geospatial queries.
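
The general idea behind sharding and replication can be sketched in a few lines. This is a simplified illustration under stated assumptions, not MongoDB's actual mechanism: `shard_for` and `ShardedStore` are hypothetical names, keys are routed by a plain hash, and every write is copied synchronously to all replicas of the chosen shard.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a string key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedStore:
    """Hypothetical store: several shards, each kept as identical replicas."""

    def __init__(self, num_shards=3, replicas_per_shard=2):
        # shards[i][j] is replica j of shard i, modeled as a dict.
        self.shards = [
            [{} for _ in range(replicas_per_shard)]
            for _ in range(num_shards)
        ]

    def put(self, key, document):
        # Sharding: only one shard is responsible for this key.
        shard = self.shards[shard_for(key, len(self.shards))]
        # Replication: every replica of that shard receives the write.
        for replica in shard:
            replica[key] = document

    def get(self, key, replica_index=0):
        # Any replica of the owning shard can serve the read.
        shard = self.shards[shard_for(key, len(self.shards))]
        return shard[replica_index].get(key)
```

Adding shards spreads keys over more machines (horizontal scaling), while the extra replicas mean a read can still be served if one copy of a shard is lost (high availability).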

4. NoSQL Key-Value Stores

Overview

• Store data as a collection of key-value pairs.
• The key is unique, and the value can be a string, JSON, binary, etc.
• Optimized for quick lookups using keys.

Popular Examples:

• Redis: In-memory key-value store; supports rich data structures like lists, sets.
• Riak: Distributed, highly available key-value store.
• Amazon DynamoDB: Scalable, managed NoSQL key-value database service by AWS.

Use Cases:

• Caching
• Session management
• Real-time recommendation engines
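
The caching and session use cases can be sketched with a small in-memory key-value store. The `KeyValueStore` class below is a hypothetical example, not the interface of Redis, Riak, or DynamoDB; the optional time-to-live (`ttl`) and the injectable `clock` are assumptions added so that expiry can be demonstrated.

```python
import time


class KeyValueStore:
    """Minimal in-memory key-value store with optional expiry per key."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock  # injectable so expiry can be tested

    def set(self, key, value, ttl=None):
        """Store value under key; ttl is a lifetime in seconds, if given."""
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key, default=None):
        """Look up a key directly; expired entries behave as missing."""
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]
            return default
        return value

    def delete(self, key):
        """Remove a key; return True if it was present."""
        return self._data.pop(key, None) is not None
```

A session could be stored with `store.set("session:42", {"user": "asha"}, ttl=1800)` and read back by key alone. Every operation goes through the key, which is why lookups stay fast but queries on the value's contents are not supported in this model.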

5. Column-Based or Wide Column NoSQL Systems

Overview

• Store each row as a set of columns that can differ from row to row, rather than in the fixed columns of a relational table.
• Data is grouped into column families, each containing multiple rows with flexible columns.

Key Features:

• High write throughput.
• Efficient storage for sparse data.
• Supports massive horizontal scaling.

Popular Examples:

• Apache Cassandra: Highly scalable, decentralized wide-column store.
• HBase: Built on top of Hadoop HDFS; suitable for real-time read/write access to large datasets.
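
The column-family layout described in this section can be sketched with nested dictionaries. The `WideColumnTable` class, the family names, and the sensor data are hypothetical; the sketch shows only the data model, not how Cassandra or HBase store data on disk.

```python
class WideColumnTable:
    """Toy model: row key -> column family -> {column name: value}."""

    def __init__(self, families):
        self.families = set(families)  # column families are fixed up front
        self.rows = {}

    def put(self, row_key, family, column, value):
        """Write one cell; columns inside a family need no prior declaration."""
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        row = self.rows.setdefault(row_key, {})
        row.setdefault(family, {})[column] = value

    def get_row(self, row_key, family):
        """Return a copy of the cells one row holds in one column family."""
        return dict(self.rows.get(row_key, {}).get(family, {}))


# Time-series style usage: one row per sensor, one column per timestamp.
table = WideColumnTable(families=["readings", "meta"])
table.put("sensor-1", "readings", "2024-01-01T00:00", 21.5)
table.put("sensor-1", "readings", "2024-01-01T00:01", 21.7)
table.put("sensor-2", "meta", "location", "roof")
```

Only the cells that were written occupy space, so `sensor-2` has no readings and `sensor-1` has no metadata. This is what makes the model efficient for sparse data.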

Use Cases:

• Time-series data
• Event logging
• Real-time analytics

6. NoSQL Graph Databases and Neo4j

Overview

• Store data as nodes (entities) and edges (relationships).
• Best suited for applications where relationships between data are crucial.

Graph Model:

• Nodes: Represent entities (e.g., people, products).
• Edges: Represent relationships (e.g., "FRIEND", "PURCHASED").
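
The node-and-edge model can be sketched with an adjacency list. The `Graph` class and the `friends_of_friends` helper below are hypothetical and written in plain Python; a graph database would express the same traversal in its own query language.

```python
from collections import defaultdict


class Graph:
    """Toy property graph: nodes with properties, typed directed edges."""

    def __init__(self):
        self.nodes = {}                 # node id -> properties dict
        self.edges = defaultdict(list)  # source id -> [(type, target id)]

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, source, rel_type, target):
        self.edges[source].append((rel_type, target))

    def neighbors(self, node_id, rel_type):
        """Targets reachable from node_id over edges of the given type."""
        return [target for kind, target in self.edges.get(node_id, [])
                if kind == rel_type]


def friends_of_friends(graph, person):
    """People two FRIEND hops away who are not already direct friends."""
    direct = set(graph.neighbors(person, "FRIEND"))
    result = set()
    for friend in direct:
        for candidate in graph.neighbors(friend, "FRIEND"):
            if candidate != person and candidate not in direct:
                result.add(candidate)
    return result
```

Because each node keeps its own list of outgoing edges, following a relationship is a direct lookup rather than a join across tables, which is the property that relationship-heavy workloads rely on.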

Neo4j

• A leading open-source graph database.
• Uses Cypher Query Language.
• ACID-compliant and supports native graph processing.

Use Cases:

• Social networks
• Fraud detection
• Recommendation systems
• Network topology analysis

Summary Comparison Table

Type               Structure                 Best For                         Examples
Document Store     JSON/BSON/XML documents   Semi-structured data             MongoDB, Couchbase
Key-Value Store    Key-value pairs           Fast lookups, caching            Redis, DynamoDB, Riak
Wide Column Store  Column families           High write volume, time-series   Cassandra, HBase
Graph Database     Nodes and relationships   Relationship-heavy data          Neo4j, Amazon Neptune
