Session 8 - NoSQL

The document provides an overview of various types of NoSQL databases, including key-value stores, columnar databases, document databases, and graph databases, highlighting their structures, advantages, and limitations. Key-value stores use unique keys to associate with values, while columnar databases optimize data retrieval by storing data in columns. Document databases allow for semi-structured data storage, and graph databases focus on representing relationships as graphs, each with specific use cases and challenges.

Uploaded by

Aanshika

NoSQL Database

Key-value store Database

• A key-value store database is a type of NoSQL database that uses a simple key-value
pair to store data.
• Data is stored as a unique key associated with a specific value.
• The key is a unique identifier (e.g., string, number).
• The value can be any type of data (e.g., text, JSON, binary).
• Popular examples include Redis, DynamoDB, Riak, and Memcached.
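
The key-value model described above can be sketched with a small in-memory store. This is only an illustration of the idea, not how Redis, DynamoDB, Riak, or Memcached are implemented; the class `KeyValueStore`, its methods, and the example keys are hypothetical.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Writing to an existing key overwrites the previous value.
        self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        # Lookup is by exact key only; the value itself is never inspected.
        return self._data.get(key, default)

    def delete(self, key: str) -> bool:
        # Returns True if the key existed and was removed, False otherwise.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:1:name", "Chuck")                           # text value
store.put("user:1:profile", {"phone": "+1 734 303 4456"})   # JSON-like value
store.put("user:1:avatar", b"\x89PNG")                      # binary value
```

Every operation goes through the key, which is why reads and writes stay simple and why there is no way to ask "which keys hold a given phone number" without scanning everything.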

Key-value store Database

• Advantages:
    • Optimized for fast reads and writes due to its simple structure.
    • Scales horizontally to handle large amounts of data.
    • Allows a dynamic data schema, as the value can hold different types of data.
• Limitations:
    • Limited support for complex querying or filtering compared to relational databases.
    • Not ideal for highly structured or relational data.
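
One common way to get the horizontal scalability mentioned above is to partition keys across several nodes. The sketch below assumes simple hash-modulo placement, which the slides do not specify (real systems often use consistent hashing instead); `shard_for`, `put`, `get`, and the shard count are hypothetical names chosen for illustration.

```python
import hashlib
from typing import Any, Dict, List

NUM_SHARDS = 4  # assumed number of nodes, chosen only for this example


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Each dict stands in for a separate server holding part of the data.
shards: List[Dict[str, Any]] = [{} for _ in range(NUM_SHARDS)]


def put(key: str, value: Any) -> None:
    shards[shard_for(key)][key] = value


def get(key: str) -> Any:
    # Only the one shard that owns the key is consulted.
    return shards[shard_for(key)].get(key)


for i in range(100):
    put(f"user:{i}", {"id": i})
```

A lookup by key touches exactly one shard, whereas a filter on the value would have to visit every shard, which matches the querying limitation listed above.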

Columnar Database

• A columnar (column-oriented) database is a type of database management system that
stores data tables by column rather than by row.
• In columnar databases, when a query is executed, only the columns that are needed
are read from disk.
• This reduces the amount of data that needs to be loaded into memory and improves
performance.
• Columnar databases are ideal for running aggregates such as sums and averages on
large datasets.
• These are also commonly used in data warehousing applications which involve large
volumes of data and frequent aggregate queries.
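
The difference between the two layouts can be sketched with plain Python structures. This is a simplified in-memory illustration, assuming a made-up orders table; an actual columnar engine stores each column in separate on-disk blocks.

```python
from typing import Dict, List

# Row-oriented layout: each record keeps all of its fields together.
rows: List[dict] = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: each column is stored as its own contiguous list.
columns: Dict[str, list] = {
    "order_id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [120.0, 80.0, 50.0],
}

# Row layout: the aggregate walks every record, loading all of its fields.
total_from_rows = sum(row["amount"] for row in rows)

# Column layout: the aggregate reads only the "amount" column.
total_from_columns = sum(columns["amount"])
average_amount = total_from_columns / len(columns["amount"])
```

Both layouts give the same total; the columnar one simply never touches `order_id` or `region` to compute it.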

Examples

• Google BigQuery: uses the Dremel query engine over columnar storage and provides a
distributed SQL query engine.
• Amazon Redshift: combines columnar storage with a massively parallel processing (MPP)
architecture.
• Snowflake: on AWS, it stores data in Amazon S3 and runs virtual warehouses on EC2 (it
also runs on other clouds). Columnar storage across clusters.
• Cassandra: a NoSQL wide-column store that groups data into column families. It is
partitioned by row key, so it is not column-oriented in the analytic sense.

Benefits of Columnar Databases

• High compression ratios, which can reach 10x or more, reduce storage costs.
• Reading only the necessary columns makes queries much faster.
• Applying a single operation across a column vector leverages CPU parallelism.
• Well suited to analytics and BI workloads with aggregations across huge datasets.
• Scales out using distributed storage and parallel execution.
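
Columns compress well because values of the same type, often repeated, sit next to each other. The sketch below uses run-length encoding as one example of such a scheme; the slides do not name a specific encoding, and `rle_encode`, `rle_decode`, and the sample column are hypothetical.

```python
from itertools import groupby
from typing import Any, List, Tuple


def rle_encode(column: List[Any]) -> List[Tuple[Any, int]]:
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(runs: List[Tuple[Any, int]]) -> List[Any]:
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# A sorted "region" column with 12 values but only 3 distinct runs.
region = ["EU"] * 5 + ["US"] * 3 + ["APAC"] * 4
encoded = rle_encode(region)  # [("EU", 5), ("US", 3), ("APAC", 4)]
```

The same data stored row by row would interleave regions with other fields, so long runs like these would not appear.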

Limitations of Columnar Databases

• Modifying and deleting data is slower and more complex, because a single row is
spread across many columns.
• Lack of row-level locking makes transactions difficult.
• Inserts, deletes, and locking mechanisms add complexity.
• Joins on unsorted data can be slow.

Document Database

• Also referred to as a document-oriented database, a document store allows inserting,
retrieving, and manipulating semi-structured data.
• Most databases in this category use XML, JSON, BSON, or YAML, with data access
typically over HTTP using APIs.
• As in an RDBMS, the documents act as records (or rows); however, they are
semi-structured, unlike the rigid rows of an RDBMS.
• For example, two records may have completely different sets of fields or columns.
• The records may or may not adhere to a specific schema (like the table definitions in
an RDBMS). The database may not support schemas, or validating a document against a
schema, at all.
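
The schema flexibility described above can be sketched with a list of dictionaries standing in for a collection. This is a toy model, not the API of any particular document database; `find`, `has_field`, and the second record ("Dana") are made up for illustration.

```python
from typing import Any, Dict, List

# A "collection" of documents; no shared schema is enforced.
people: List[Dict[str, Any]] = [
    {"_id": 1, "name": "Chuck",
     "phone": {"type": "intl", "number": "+1 734 303 4456"}},
    {"_id": 2, "name": "Dana", "email": "dana@example.com", "tags": ["admin"]},
]


def find(collection: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Return documents whose top-level fields equal all given criteria."""
    return [
        doc for doc in collection
        if all(key in doc and doc[key] == value for key, value in criteria.items())
    ]


def has_field(collection: List[Dict[str, Any]], field: str) -> List[Dict[str, Any]]:
    """Return documents that contain the given top-level field at all."""
    return [doc for doc in collection if field in doc]
```

Because the two documents carry different fields, a query has to tolerate missing keys, which is the kind of check a fixed table definition would make unnecessary.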

eXtensible Markup Language (XML)

• Used to exchange data across the web

<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>

• Element (root node) – person
• Nested elements – name, phone, email
• Value (text content of phone) – +1 734 303 4456
• Attributes – type (on phone), hide (on email)
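
As a rough illustration of how these parts are accessed in code, the sketch below parses the same document with Python's standard `xml.etree.ElementTree` module. The variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

xml_text = """
<person>
  <name>Chuck</name>
  <phone type="intl">
    +1 734 303 4456
  </phone>
  <email hide="yes" />
</person>
"""

person = ET.fromstring(xml_text)                    # root element

name = person.find("name").text                     # text of a nested element
phone = person.find("phone")
phone_type = phone.get("type")                      # attribute value
phone_number = phone.text.strip()                   # trim surrounding whitespace
email_hidden = person.find("email").get("hide")     # attribute on an empty element
child_tags = [child.tag for child in person]        # nested element names, in order
```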

JavaScript Object Notation (JSON)

• Used to exchange data across the web

{
  "name" : "Chuck",
  "phone" : {
    "type" : "intl",
    "number" : "+1 734 303 4456"
  },
  "email" : {
    "hide" : "yes"
  }
}
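
The same record can be read with Python's standard `json` module, which maps JSON objects onto dictionaries. This is a small sketch; the variable names are arbitrary.

```python
import json

json_text = """
{
  "name": "Chuck",
  "phone": {"type": "intl", "number": "+1 734 303 4456"},
  "email": {"hide": "yes"}
}
"""

person_doc = json.loads(json_text)          # JSON object -> Python dict
number = person_doc["phone"]["number"]      # nested values are reached by key
hide_email = person_doc["email"]["hide"]

# Serializing and parsing again yields an equal structure.
round_trip = json.loads(json.dumps(person_doc))
```

Unlike the XML version, there is no distinction between attributes and nested elements: `type` and `number` are both plain keys of the `phone` object.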

Benefits of Document Databases

• Schema flexibility for dynamic or evolving data structures.
• High scalability for large datasets and distributed systems.
• Easy integration with modern applications and APIs.
• Supports hierarchical and nested data natively.
• Simplifies querying with rich, document-oriented query languages.

Limitations of Document Databases

• Not ideal for complex transactions involving multiple entities.
• Can lead to data duplication when related data is embedded in many documents.
• Limited support for advanced join operations compared to relational databases.
• Schema flexibility may lead to inconsistent data structures.
• Limited standardization across different document database implementations.

Graph Database

• Graph databases are a special category of NoSQL databases in which data and
relationships are represented as a graph of nodes and links.
• There can be multiple links between two nodes in a graph, representing the multiple
relationships that the two nodes share.
• The relationships represented may include social relationships between people,
transport links between places, or network topologies between connected systems.
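
The idea of several links between the same two nodes can be sketched with a labeled adjacency list. This is a toy in-memory structure, not the storage model of Neo4j or any other product; the node names, relationship labels, and helper functions are made up for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple

# Adjacency list: node -> list of (relationship, neighbour) pairs.
graph: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)


def add_edge(source: str, relationship: str, target: str) -> None:
    """Add a directed, labeled link from source to target."""
    graph[source].append((relationship, target))


def relationships(source: str, target: str) -> List[str]:
    """List every relationship label linking source to target."""
    return [rel for rel, node in graph.get(source, []) if node == target]


# Two different relationships between the same pair of nodes.
add_edge("Alice", "FRIEND_OF", "Bob")
add_edge("Alice", "WORKS_WITH", "Bob")
add_edge("Bob", "LIVES_IN", "Pune")
```

Here the links are directed, so `relationships("Bob", "Alice")` is empty unless the reverse edges are added as well.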

Examples

• Neo4j
• FlockDB (from Twitter)

Benefits of Graph Databases

• High performance: efficiently handles complex relationship queries on large datasets.
• Flexible schema: adapts easily to evolving data structures.
• Natural data modeling: represents relationships intuitively using nodes and edges.
• Fast traversals: navigates connected data by following links rather than computing joins.
• Real-time insights: well suited to real-time analytics and recommendations.
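
A traversal of the kind used for recommendations can be sketched as a breadth-first search over an adjacency list. This is a minimal illustration on a made-up friendship graph; `within_hops` and `recommend` are hypothetical helpers, and a graph database would express the same query in its own query language.

```python
from collections import deque
from typing import Dict, List

# Undirected friendship graph stored as an adjacency list (made-up data).
friends: Dict[str, List[str]] = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice", "Dave", "Eve"],
    "Dave": ["Bob", "Carol"],
    "Eve": ["Carol"],
}


def within_hops(start: str, max_hops: int) -> Dict[str, int]:
    """Breadth-first traversal returning each reachable node's hop distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in friends.get(node, []):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    del distance[start]  # the start node is not a result
    return distance


def recommend(start: str) -> List[str]:
    """Suggest friends-of-friends: nodes exactly two hops away."""
    reachable = within_hops(start, 2)
    return sorted(name for name, hops in reachable.items() if hops == 2)
```

Each step follows existing links from the current node, so the work depends on the size of the neighbourhood explored rather than on the total number of records.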

Limitations of Graph Databases

• Steep learning curve: requires specialized knowledge and skills.
• Not ideal for highly transactional or tabular data.
• Scalability challenges: partitioning a highly connected graph across machines is difficult.
• Fewer tools and less ecosystem support compared to relational databases.
• Many graph databases are proprietary, limiting flexibility.
