Module 5 - NoSQL
By,
Dr. Bhargava R
SQL vs NoSQL
SQL | NoSQL
Relational DB | Distributed DB
Defined schema | Dynamic schema
Vertically scalable | Horizontally scalable
Lower availability | Highly available
Supports complex queries | Not suited to complex queries
CAP Theorem
• Consistency means that all nodes hold the same copy of a replicated data item,
and that copy is visible to every transaction.
• It is a guarantee that every node in a distributed cluster returns the same,
most recent successful write.
• In other words, every client has the same view of the data.
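The consistency guarantee above can be illustrated with a minimal in-memory sketch. It assumes synchronous replication, where a write is applied to every node before it is acknowledged. The `Replica` and `ConsistentCluster` classes are hypothetical names invented for this example and do not belong to any real database.

```python
class Replica:
    """One node holding a copy of the data (hypothetical, in-memory)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class ConsistentCluster:
    """Applies each write to every replica before returning (assumed design)."""

    def __init__(self, replicas):
        self.replicas = replicas

    def write(self, key, value):
        # Synchronous replication: update all nodes before acknowledging.
        for replica in self.replicas:
            replica.data[key] = value

    def read(self, key, node_index):
        # A client may read from any node and still see the latest write.
        return self.replicas[node_index].data.get(key)


cluster = ConsistentCluster([Replica("n1"), Replica("n2"), Replica("n3")])
cluster.write("balance", 100)
cluster.write("balance", 80)

# Every node returns the same, most recent successful write.
views = [cluster.read("balance", i) for i in range(3)]
print(views)  # [80, 80, 80]
```

A real distributed system pays for this guarantee in write latency, and possibly in availability when a node is unreachable.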
Availability
Key-Value Store
Features:
• One of the simplest kinds of NoSQL data models.
• Key-value databases use simple operations to store, retrieve, and remove data.
• Key-value databases have no query language.
• Built-in redundancy makes this database more reliable.
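The "simple functions" mentioned above usually amount to three operations. The sketch below is a minimal in-memory illustration. The `KeyValueStore` class and its method names `put`, `get`, and `delete` are assumptions made for this example, not the API of a specific product.

```python
class KeyValueStore:
    """Hypothetical in-memory key-value store with three basic operations."""

    def __init__(self):
        self._items = {}

    def put(self, key, value):
        # Store (or overwrite) the value under the given key.
        self._items[key] = value

    def get(self, key, default=None):
        # Retrieve the value for a key, or a default if the key is absent.
        return self._items.get(key, default)

    def delete(self, key):
        # Remove the key; return True if it existed, False otherwise.
        if key in self._items:
            del self._items[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42", {"name": "Asha"})
name = store.get("user:42")["name"]
removed = store.delete("user:42")
missing = store.get("user:42")
print(name, removed, missing)  # Asha True None
```

The store treats each value as an opaque blob. It never inspects the contents, which is why no query language is needed.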
Key-Value Store
Advantages:
• It is very easy to use.
• Its response time is fast.
• Key-value stores scale vertically as well as horizontally.
• Built-in redundancy makes this database more reliable.
Disadvantages:
• Because key-value databases have no standard query language, queries cannot be
ported from one database to another.
• Access is coarse-grained: you cannot query the database without knowing the
key.
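To see why the missing key is a limitation, the sketch below models a key-value store as a plain Python dictionary, which is an assumption made for illustration. A lookup by key is direct, while a lookup by value has to examine every entry. The `sessions` data and the helper `find_keys_by_value` are hypothetical.

```python
# Hypothetical session data keyed by session id.
sessions = {
    "s1": {"user": "asha", "active": True},
    "s2": {"user": "ravi", "active": False},
    "s3": {"user": "meera", "active": True},
}


def find_keys_by_value(store, predicate):
    """Full scan: the only way to search when the key is unknown."""
    return [key for key, value in store.items() if predicate(value)]


# Lookup by key: a single direct access.
by_key = sessions["s2"]["user"]

# Lookup by value: every entry must be inspected.
active = find_keys_by_value(sessions, lambda v: v["active"])

print(by_key)   # ravi
print(active)   # ['s1', 's3']
```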
Document-Based
• The document data model differs from other data models because it stores data
in JSON, BSON, or XML documents.
• It is a semi-structured data model: each record and its associated data are
stored together in a single document, so the data is not completely unstructured.
• The key point is that data is stored in documents.
Document-Based
Features:
• Document Type Model: Data is stored in documents rather than tables or graphs, so it maps easily
onto objects in many programming languages.
• Flexible Schema: The schema is flexible; documents in the same collection do not need to have
the same fields.
• Distributed and Resilient: Document databases distribute data across nodes, which is what makes
horizontal scaling possible.
• Manageable Query Language: The query language lets developers perform CRUD (Create, Read,
Update, Delete) operations on the data.
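The flexible schema and the CRUD operations listed above can be sketched with a small in-memory collection. This is an illustration only, not a real database API. The `Collection` class, its methods, and the sample book documents are hypothetical.

```python
import copy


class Collection:
    """Hypothetical in-memory document collection supporting CRUD."""

    def __init__(self):
        self._docs = {}
        self._next_id = 1

    def create(self, doc):
        # Store a copy so later changes to the caller's dict do not leak in.
        doc_id = self._next_id
        self._next_id += 1
        self._docs[doc_id] = copy.deepcopy(doc)
        return doc_id

    def read(self, **criteria):
        # Return documents whose top-level fields match all criteria.
        return [
            copy.deepcopy(doc)
            for doc in self._docs.values()
            if all(doc.get(field) == value for field, value in criteria.items())
        ]

    def update(self, doc_id, changes):
        # Fields may be added freely; there is no fixed schema to migrate.
        self._docs[doc_id].update(changes)

    def delete(self, doc_id):
        del self._docs[doc_id]


books = Collection()
# Two documents in one collection with different fields (flexible schema).
first = books.create({"title": "Intro to NoSQL",
                      "author": {"name": "A. Author"},  # nested sub-document
                      "tags": ["databases"]})
second = books.create({"title": "Lecture Notes", "pages": 12})

books.update(first, {"edition": 2})
found = books.read(title="Intro to NoSQL")
books.delete(second)
remaining = len(books.read())

print(found[0]["edition"], remaining)  # 2 1
```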
Applications of Document Data Model:
• Content Management: Widely used for video streaming platforms, blogs, and similar services,
because each item is stored as a single document and the database is easier to maintain as the
service evolves over time.
• Book Database: Useful for building book databases, because this data model lets us nest related
data inside a document.
• Catalog: Well suited to storing and reading catalog data, because reads stay fast even when
catalog items have thousands of attributes.
• Analytics Platform: These data models are widely used in analytics platforms.
Document-Based
Advantages:
• Schema-less:
• Faster creation of document and maintenance:
• Open formats:
• Built-in versioning:
Disadvantages:
• Weak Atomicity: Support for multi-document ACID transactions is weak. A change that involves
two collections requires two separate queries, one for each collection, so the change as a whole
is not atomic.
• Consistency Check Limitations: One can search for collections and documents that are not
connected to an author collection, but doing so can degrade database performance.
• Security: Many web applications lack adequate security, which leads to leaks of sensitive data,
so one must pay attention to web app vulnerabilities.
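The weak atomicity point can be made concrete with a small simulation. The sketch below assumes two collections, modeled as plain dictionaries, that must be updated by two separate queries. The names `authors`, `books`, and `add_book`, and the simulated failure, are hypothetical.

```python
# Two hypothetical collections that must stay in step with each other.
authors = {"a1": {"name": "A. Author", "book_count": 1}}
books = {"b1": {"title": "First Book", "author_id": "a1"}}


def add_book(book_id, title, author_id, fail_between=False):
    # Query 1: insert into the books collection.
    books[book_id] = {"title": title, "author_id": author_id}
    if fail_between:
        # Simulated crash after the first query but before the second.
        raise RuntimeError("simulated crash between the two queries")
    # Query 2: update the authors collection.
    authors[author_id]["book_count"] += 1


try:
    add_book("b2", "Second Book", "a1", fail_between=True)
except RuntimeError:
    pass  # The first query has already taken effect; nothing rolls it back.

stored = sum(1 for book in books.values() if book["author_id"] == "a1")
counted = authors["a1"]["book_count"]
print(stored, counted)  # 2 1
```

The two collections now disagree. A multi-document transaction would have rolled back the first write to prevent this.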
Column-Based
• A relational database stores data in rows and reads it row by row, whereas a
column store is organized as a set of columns.
• To run analytics on a small number of columns, one can read those columns
directly without loading unwanted data into memory.
• Values within a column are of the same type, so they compress more efficiently,
which makes reads faster.
• Examples of columnar data models: Apache Cassandra and Apache HBase.
• The columnar data model organizes information into columns instead of rows. To
the user, the data still behaves much like tables in a relational database.
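The difference between the two layouts can be sketched in a few lines. The example assumes a tiny hypothetical sales table. It shows the same data in row and column form, an aggregate that reads only one column, and run-length encoding as one simple example of how same-typed column values compress well. Real column stores use more elaborate storage formats.

```python
# Row-oriented layout: one record per entry (hypothetical sample data).
rows = [
    {"id": 1, "city": "Pune", "amount": 120},
    {"id": 2, "city": "Pune", "amount": 80},
    {"id": 3, "city": "Delhi", "amount": 200},
]

# Column-oriented layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}

# Row store: every full record is visited to sum a single field.
total_from_rows = sum(row["amount"] for row in rows)

# Column store: only the "amount" column is read.
total_from_columns = sum(columns["amount"])


def run_length_encode(values):
    """Collapse runs of repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


compressed_city = run_length_encode(columns["city"])

print(total_from_rows, total_from_columns)  # 400 400
print(compressed_city)  # [['Pune', 2], ['Delhi', 1]]
```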
Column-Based
Advantages of Columnar Data Model:
• Well structured:
• Flexibility:
• Aggregation queries are fast:
• Scalability:
• Load Times:
Disadvantages of Columnar Data Model:
• Designing indexing schema: Designing an effective indexing schema is difficult and
time-consuming.
• Suboptimal data loading: Incremental data loading is suboptimal and should be avoided, though
this may not be an issue for some users.
• Security vulnerabilities: The columnar data model lacks built-in security features. If security
is a priority, one should consider relational databases instead.
• Online Transaction Processing (OLTP): OLTP applications are not well suited to columnar data
models because of the way data is stored.
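One way to see why OLTP workloads fit poorly is to count the separate storage locations a single-record insert touches. The sketch below is a deliberately simplified model, assuming one write per row in a row store and one write per column in a column store. The function names and the sample `order` record are hypothetical.

```python
def insert_row_store(table, record):
    """Row store: the whole record is appended in one place."""
    table.append(record)
    return 1  # one write location


def insert_column_store(columns, record):
    """Column store: each field is appended to its own column."""
    writes = 0
    for name, value in record.items():
        columns.setdefault(name, []).append(value)
        writes += 1
    return writes  # one write location per column


# Hypothetical single order, as an OLTP application would insert it.
order = {"id": 7, "customer": "asha", "item": "book", "amount": 350}

row_table = []
column_table = {}

row_writes = insert_row_store(row_table, order)
column_writes = insert_column_store(column_table, order)

print(row_writes, column_writes)  # 1 4
```

OLTP applications perform many small inserts and updates like this one, so the per-column cost adds up. Analytical queries read a few columns across many rows and benefit from the same layout.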
Column-Based
Applications of Columnar Data Model: