NoSql Module 1 Part2
NoSQL 24-06-2022
More Details on Data Models
Relationships
The connections between entities in a data model are called relationships, and relationships
reflect business rules.
Relationships between entities can be one-to-one, one-to-many, or many-to-many. The
relationship between vendors and products can illustrate a one-to-many relationship: for
example, a single vendor may supply many products.
NoSQL databases can store relationship data; they just store it differently than relational
databases do.
In fact, many find modeling relationship data in NoSQL databases easier than in relational
databases, because related data doesn’t have to be split between tables.
NoSQL data models allow related data to be nested within a single data structure.
An important aspect of relationships between aggregates is how they handle updates.
Aggregate-oriented databases treat the aggregate as the unit of data retrieval. Consequently,
atomicity is only supported within the contents of a single aggregate. If you update multiple
aggregates at once, you have to deal with a failure partway through yourself. Relational
databases help you with this by allowing you to modify multiple records in a single
transaction, providing ACID guarantees while altering many rows. All of this means that
aggregate-oriented databases become more awkward as you need to operate across multiple
aggregates.
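The failure mode described above can be sketched with a toy in-memory key-value store. This is an illustration only: the `AggregateStore` class, its `put` and `get` methods, the `fail_on_key` switch, and the sample data are all hypothetical and do not correspond to any particular database's API.

```python
class AggregateStore:
    """Toy key-value store: each put() replaces one whole aggregate."""

    def __init__(self):
        self._data = {}
        self.fail_on_key = None  # hypothetical switch used to simulate a crash

    def put(self, key, aggregate):
        if key == self.fail_on_key:
            raise IOError(f"simulated failure while writing {key!r}")
        self._data[key] = dict(aggregate)  # single-aggregate write: all or nothing

    def get(self, key):
        return self._data.get(key)


def place_order(store, customer_id, order):
    """Update two aggregates; nothing makes the pair of writes atomic."""
    store.put(f"order:{order['id']}", order)
    customer = store.get(f"customer:{customer_id}") or {"orders": []}
    customer = {**customer, "orders": customer["orders"] + [order["id"]]}
    store.put(f"customer:{customer_id}", customer)


store = AggregateStore()
store.put("customer:1", {"name": "Ann", "orders": []})
store.fail_on_key = "customer:1"  # the second write will fail

try:
    place_order(store, 1, {"id": 99, "total": 25.0})
except IOError:
    # The application must repair the inconsistency itself, for example by
    # retrying the customer update or deleting the orphaned order.
    pass

# The order was written, but the customer aggregate does not reference it.
orphaned = (store.get("order:99") is not None
            and 99 not in store.get("customer:1")["orders"])
```

In a relational database both writes could be wrapped in one transaction and rolled back together; here the clean-up logic belongs to the application.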
Graph databases
Graph databases are an odd fish in the NoSQL pond.
Most NoSQL databases were inspired by the need to run on clusters, which led to
aggregate-oriented data models of large records with simple connections.
Graph databases are motivated by a different frustration with relational databases and thus
have an opposite model—small records with complex interconnections.
Graph databases specialize in capturing this sort of interconnected information, but on a
much larger scale than a readable diagram could capture.
This is ideal for capturing any data consisting of complex relationships such as social
networks, product preferences, or eligibility rules.
The fundamental data model of a graph database is very simple: nodes connected by edges
(also called arcs).
Beyond this essential characteristic there is a lot of variation in data models—in particular,
what mechanisms you have to store data in your nodes and edges.
A quick sample of some current capabilities illustrates this variety of possibilities:
FlockDB is simply nodes and edges with no mechanism for additional attributes;
Neo4j allows you to attach Java objects as properties to nodes and edges in a schemaless
fashion;
Infinite Graph stores your Java objects, which are subclasses of its built-in types, as nodes
and edges.
A graph database allows you to query that network with query operations designed with graphs in mind. Graph
databases are purpose-built to store and navigate relationships. Relationships are first class
citizens in graph databases, and most of the value of graph databases is derived from these
relationships. Graph databases use nodes to store data entities, and edges to store relationships
between entities. An edge always has a start node, end node, type, and direction, and an edge
can describe parent-child relationships, actions, ownership, and the like. There is no limit to
the number and kind of relationships a node can have.
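As a rough sketch of the node-and-edge model just described, the snippet below stores each edge with a start node, an end node, and a type, with direction implied by start and end. The `Edge` and `Graph` classes, the `neighbours` method, and the sample people and company are hypothetical names chosen for illustration, not the API of any graph database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    start: str   # node where the edge begins
    end: str     # node where the edge ends
    type: str    # relationship type, e.g. "FRIEND" or "EMPLOYEE"


class Graph:
    def __init__(self):
        self.nodes = {}                # node id -> properties
        self._out = defaultdict(list)  # node id -> outgoing edges, stored at write time

    def add_node(self, node_id, **properties):
        self.nodes[node_id] = properties

    def add_edge(self, start, end, edge_type):
        self._out[start].append(Edge(start, end, edge_type))

    def neighbours(self, node_id, edge_type=None):
        """Follow outgoing edges, optionally restricted to one edge type."""
        return [e.end for e in self._out[node_id]
                if edge_type is None or e.type == edge_type]


g = Graph()
for name in ("anna", "barbara", "carol"):
    g.add_node(name, kind="person")
g.add_node("bigco", kind="company")
g.add_edge("anna", "barbara", "FRIEND")
g.add_edge("anna", "carol", "FRIEND")
g.add_edge("anna", "bigco", "EMPLOYEE")

friends = g.neighbours("anna", "FRIEND")  # traverse one edge type only
everything = g.neighbours("anna")         # traverse every outgoing edge
```

Because the edges are recorded when they are written, following them is a lookup rather than a computation, which is the property the text attributes to graph databases.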
A graph in a graph database can be traversed along specific edge types or across the entire
graph. In graph databases, traversing the joins or relationships is very fast because the
relationships between nodes are not calculated at query time but are persisted in the database.
Graph databases have advantages for use cases such as social networking, recommendation
engines, and fraud detection, when you need to create relationships between data and quickly
query these relationships.
Schemaless Databases
Because NoSQL databases are designed to store and query unstructured data, they do not
require the same rigid schemas used by relational databases.
Although a schema can be applied at the application level, NoSQL databases retain all of
your unstructured data in its original raw format.
This means that complete granularity is retained, even if you later change your application
schema. This is difficult with a traditional SQL database, where data must fit the table
schema before it can be stored.
In MongoDB, for example, the database management system (DBMS) enforces only a partial
schema as data is written, explicitly listing collections and indexes.
The applications that use the data stored in MongoDB then enforce a much stricter,
dynamically typed schema as documents are read from the database.
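The idea of an application enforcing its own schema at read time can be sketched as below. The `read_customer` function, the field names, and the sample documents are assumptions made for illustration; plain dictionaries stand in for stored documents and no database driver is involved.

```python
def read_customer(doc):
    """Apply an application-level schema to a raw document at read time."""
    if "name" not in doc:
        raise ValueError("customer document has no 'name' field")
    return {
        "name": str(doc["name"]),
        # Older documents may lack this field; use None instead of failing.
        "email": doc.get("email"),
        # Tolerate ages stored as either int or string.
        "age": int(doc["age"]) if "age" in doc else None,
    }


# Raw documents as they might sit in a collection: no shared structure is enforced.
raw_documents = [
    {"name": "Ann", "email": "ann@example.com", "age": "34"},
    {"name": "Bob", "age": 41, "loyalty_tier": "gold"},  # extra field stays in storage
]

customers = [read_customer(d) for d in raw_documents]
```

The stored documents are left untouched; only the view the application builds from them is normalized, so a later schema change means changing `read_customer` rather than rewriting stored data.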
By operating without a schema, schemaless databases can store, retrieve, and query
any data type, which suits big data analytics and similar operations that are powered
by unstructured data. Relational databases apply rigid schema rules to data, limiting
what can be stored.
By: Yojana Kiran Kumar, Asst. Professor, Dept of BVOC, SDM
MODULE 1
The lack of schema means that your NoSQL database can accept any data type —
including those that you do not yet use. This future-proofs your database, allowing it
to grow and change as your data-driven operations change and mature.
∙ No data truncation
A schemaless database makes almost no changes to your data; each item is saved in its
own document with a partial schema, leaving the raw information untouched. This
means that every detail is always available and nothing is stripped to match the current
schema. This is particularly valuable if your analytics needs change at some point in
the future.
With the ability to process unstructured data, applications built on NoSQL databases
are better able to process real-time data, such as readings and measurements from IoT
sensors. Schemaless databases are also ideal for use with machine learning and
artificial intelligence operations, helping to accelerate automated actions in your
business.
With NoSQL, you can use whichever data model is best suited to the job. Graph
databases allow you to view relationships between data points, or you can use
traditional wide table views with an exceptionally large number of columns. You can
query, report, and model information however you choose. And as your requirements
grow, you can keep adding nodes to increase capacity and power.
Materialized Views
A materialized view is a database object that stores the results of a query. For example, it
may be a local copy of data located remotely, a subset of the rows and/or columns of a table
or join result, or a summary using an aggregate function.
Views:
A view is a virtual relation that acts as an actual relation. It is not part of the logical
relational model of the database system. Tuples of the view are not stored in the database
system; they are generated every time the view is accessed. Only the query expression of
the view is stored in the database system.
Views can be used wherever we can use an actual relation. Views can be used to
create custom virtual relations according to the needs of a specific user. We can create as
many views as we want in a database system.
Materialized Views:
When the results of a view expression are stored in a database system, they are called
materialized views. SQL does not provide any standard way of defining a materialized view;
however, some database management systems provide custom extensions to use materialized
views. The process of keeping materialized views up to date is known as view maintenance.
A database system uses one of three ways to keep a materialized view updated: update it as
soon as the relation on which it is defined is updated, update it every time the view is
accessed, or update it periodically.
Views | Materialized Views
Only the query expression is stored in the database system, not the resulting tuples of the query expression. | The resulting tuples of the query expression are stored in the database system.
A view need not be updated every time the relation on which it is defined is updated, as the tuples of the view are computed every time the view is accessed. | Materialized views have to be updated, as their tuples are stored in the database system. They can be updated in one of three ways, depending on the database system, as mentioned above.
It does not have any storage cost associated with it. | It does have a storage cost associated with it.
It does not have any update cost associated with it. | It does have an update cost associated with it.
Views are useful when the view is accessed infrequently. | Materialized views are efficient when the view is accessed frequently, as storing the results beforehand saves computation time.
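The contrast between a view and a materialized view can be tried out with Python's built-in `sqlite3` module. SQLite supports ordinary views but has no native materialized views, so this sketch emulates one with a plain table that is rebuilt on demand; the table names, the `refresh_materialized` and `total_for` helpers, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [("ann", 10.0), ("ann", 5.0), ("bob", 7.0)])

summary_query = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"

# Ordinary view: only the query expression is stored.
conn.execute(f"CREATE VIEW order_totals AS {summary_query}")


def refresh_materialized(connection):
    """View maintenance by full recomputation (the simplest strategy)."""
    connection.execute("DROP TABLE IF EXISTS order_totals_mat")
    connection.execute(f"CREATE TABLE order_totals_mat AS {summary_query}")


def total_for(connection, relation, customer):
    rows = connection.execute(
        f"SELECT total FROM {relation} WHERE customer = ?", (customer,)).fetchall()
    return rows[0][0]


refresh_materialized(conn)
conn.execute("INSERT INTO orders VALUES ('ann', 20.0)")

view_total = total_for(conn, "order_totals", "ann")       # recomputed on access
stale_total = total_for(conn, "order_totals_mat", "ann")  # stored result, now out of date
refresh_materialized(conn)
fresh_total = total_for(conn, "order_totals_mat", "ann")  # up to date after maintenance
```

The view always reflects the latest rows at the price of recomputing the aggregate on each read, while the stored copy is cheap to read but only as current as its last refresh.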
When modeling data aggregates, we need to consider how the data is going to be read as well
as the side effects on data related to those aggregates. Let’s start with the model
where all the data for the customer is embedded using a key-value store.
In this scenario, the application can read the customer’s information and all the related data
by using the key. If the requirements are to read the orders or the products sold in each order,
the whole object has to be read and then parsed on the client side to build the results.
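The read path just described can be sketched with a dictionary standing in for the key-value store. The key format, the field names, and the sample customer and orders are hypothetical; the point is only that the store sees an opaque value and the client does the parsing.

```python
import json

# Toy key-value store: values are opaque strings as far as the store is concerned.
kv_store = {}

customer = {
    "customerId": 1,
    "name": "Martin",
    "orders": [
        {"orderId": 99, "items": [{"product": "NoSQL Distilled", "price": 32.45}]},
        {"orderId": 100, "items": [{"product": "Refactoring Databases", "price": 40.00}]},
    ],
}
kv_store["customer:1"] = json.dumps(customer)

# Reading by key returns the whole aggregate; the client parses it and
# extracts what it needs (here, the products sold in each order).
aggregate = json.loads(kv_store["customer:1"])
products_by_order = {
    order["orderId"]: [item["product"] for item in order["items"]]
    for order in aggregate["orders"]
}
```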
When references are needed, we could switch to document stores and then query inside the
documents, or even change the data for the key-value store to split the value object into
Customer and Order objects and then maintain these objects’ references to each other.
With the references we can now find the orders independently from the Customer, and with
the orderId reference in the Customer we can find all Orders for the Customer. Using
aggregates this way allows for read optimization, but we have to push the orderId reference
into Customer every time a new Order is placed.
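A minimal sketch of the split model, again using a dictionary as a stand-in key-value store. The `place_order` and `orders_for_customer` helpers, the key format, and the sample data are hypothetical.

```python
kv_store = {
    "customer:1": {"customerId": 1, "name": "Martin", "orders": []},
}


def place_order(store, customer_id, order_id, items):
    """Store the Order under its own key and push its id into the Customer."""
    store[f"order:{order_id}"] = {
        "orderId": order_id,
        "customerId": customer_id,  # reference back to the Customer
        "items": items,
    }
    store[f"customer:{customer_id}"]["orders"].append(order_id)


def orders_for_customer(store, customer_id):
    """Follow the orderId references held in the Customer aggregate."""
    customer = store[f"customer:{customer_id}"]
    return [store[f"order:{oid}"] for oid in customer["orders"]]


place_order(kv_store, 1, 99, ["NoSQL Distilled"])
place_order(kv_store, 1, 100, ["Refactoring Databases"])

single_order = kv_store["order:100"]           # found without reading the Customer
all_orders = orders_for_customer(kv_store, 1)  # found via the Customer's references
```

Every call to `place_order` performs two writes, which is the extra update cost the text mentions.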
Aggregates can also be used to obtain analytics; for example, an aggregate update may fill in
information on which Orders have a given Product in them.
This denormalization of the data allows for fast access to the data we are interested in and is
the basis for real-time BI or real-time analytics, where enterprises don’t have to rely on
end-of-the-day batch runs to populate data warehouse tables and generate analytics; now they
can fill in this type of data, for multiple types of requirements, when the order is placed by the
customer.
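One way to picture such an analytics aggregate is a product-to-orders index that is filled in whenever an order is placed. The `orders_by_product` structure, the `place_order` function, and the sample data are assumptions made for this sketch.

```python
from collections import defaultdict

orders = {}
# Denormalized aggregate kept for analytics: product name -> ids of orders containing it.
orders_by_product = defaultdict(list)


def place_order(order_id, customer_id, products):
    """Write the order and update the analytics aggregate in the same step."""
    orders[order_id] = {"customerId": customer_id, "products": products}
    for product in products:
        orders_by_product[product].append(order_id)


place_order(99, 1, ["NoSQL Distilled", "Refactoring Databases"])
place_order(100, 2, ["Refactoring Databases"])

# Answered directly from the precomputed aggregate, with no scan over all orders.
refactoring_orders = orders_by_product["Refactoring Databases"]
```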
In document stores, since we can query inside documents, removing references to Orders
from the Customer object is possible. This change allows us to not update the Customer
object when new orders are placed by the Customer.
Since document data stores allow you to query by attributes inside the document, searches
such as “find all orders that include the Refactoring Databases product” are possible, but the
decision to create an aggregate of items and orders they belong to is not based on the
database’s query capability but on the read optimization desired by the application.
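Querying by an attribute inside the document can be imitated with a list of dictionaries. A real document store would typically use its own query language and indexes; the `find_orders_with_product` function, the field names, and the sample orders below are hypothetical.

```python
orders = [
    {"orderId": 99, "customerId": 1,
     "items": [{"name": "NoSQL Distilled"}, {"name": "Refactoring Databases"}]},
    {"orderId": 100, "customerId": 2,
     "items": [{"name": "Domain-Driven Design"}]},
    {"orderId": 101, "customerId": 1,
     "items": [{"name": "Refactoring Databases"}]},
]


def find_orders_with_product(documents, product_name):
    """Match on an attribute nested inside each order document."""
    return [doc for doc in documents
            if any(item["name"] == product_name for item in doc["items"])]


matches = find_orders_with_product(orders, "Refactoring Databases")
matching_ids = [doc["orderId"] for doc in matches]

# Orders carry customerId, so the Customer document needs no list of order ids.
orders_of_customer_1 = [d["orderId"] for d in orders if d["customerId"] == 1]
```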
When modeling for column-family stores, we have the benefit of the columns being ordered,
allowing us to name columns that are frequently used so that they are fetched first. When
using the column families to model the data, it is important to remember to do it per your
query requirements and not for the purpose of writing; the general rule is to make it easy to
query and denormalize the data during write.
As you can imagine, there are multiple ways to model the data; one way is to store the
Customer and Order in different column families.
Here, it is important to note that the references to all the orders placed by the customer are
in the Customer column family. Other similar denormalizations are generally done so that
query (read) performance is improved.
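The layout described above might look roughly like the nested dictionaries below, with one mapping per column family. The row keys, the column names, the `order:` column prefix, and the `columns_with_prefix` helper are assumptions for illustration; real column-family stores expose their own APIs.

```python
# column family -> row key -> columns (column name -> value)
customer_cf = {
    "customer:1": {
        "name": "Martin",
        "city": "Chicago",
        # Denormalized references: one column per order placed by this customer.
        "order:0099": "2022-06-01",
        "order:0100": "2022-06-20",
    },
}
order_cf = {
    "0099": {"customerId": "1", "product": "NoSQL Distilled"},
    "0100": {"customerId": "1", "product": "Refactoring Databases"},
}


def columns_with_prefix(row, prefix):
    """Return matching columns in sorted name order, mimicking ordered columns."""
    return [(name, row[name]) for name in sorted(row) if name.startswith(prefix)]


order_columns = columns_with_prefix(customer_cf["customer:1"], "order:")
order_ids = [name.split(":", 1)[1] for name, _ in order_columns]
customer_orders = [order_cf[order_id] for order_id in order_ids]
```

Reading a customer's orders takes one row read from the Customer column family followed by direct lookups in the Order column family, which is the read optimization the denormalization buys.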
When using graph databases to model the same data, we model all objects as nodes and
relations within them as relationships; these relationships have types and directional
significance. Each node has independent relationships with other nodes.
These relationship names let you traverse the graph. Let’s say you want to find all the
Customers who PURCHASED a product with the name Refactoring Databases.
All we need to do is query for the product node Refactoring Databases and look for all the
Customers with the incoming PURCHASED relationship.
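The lookup can be sketched in plain Python by indexing relationships by their end node. A graph database would normally express this in its own query language; the node ids, the `incoming` index, the `customers_who_purchased` function, and the sample customers here are hypothetical.

```python
from collections import defaultdict

nodes = {
    "martin": {"label": "Customer", "name": "Martin"},
    "pramod": {"label": "Customer", "name": "Pramod"},
    "p1": {"label": "Product", "name": "Refactoring Databases"},
    "p2": {"label": "Product", "name": "NoSQL Distilled"},
}

# Relationships stored as (start, type, end); direction runs from start to end.
relationships = [
    ("martin", "PURCHASED", "p1"),
    ("pramod", "PURCHASED", "p1"),
    ("martin", "PURCHASED", "p2"),
    ("martin", "FRIEND", "pramod"),
]

# Index incoming relationships once so lookups do not scan every edge.
incoming = defaultdict(list)
for start, rel_type, end in relationships:
    incoming[end].append((rel_type, start))


def customers_who_purchased(product_name):
    """Find the product node, then walk its incoming PURCHASED relationships."""
    product_ids = [nid for nid, props in nodes.items()
                   if props["label"] == "Product" and props["name"] == product_name]
    return sorted(nodes[start]["name"]
                  for pid in product_ids
                  for rel_type, start in incoming[pid]
                  if rel_type == "PURCHASED")


buyers = customers_who_purchased("Refactoring Databases")
```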
This type of relationship traversal is very easy with graph databases.
It is especially convenient when you need to use the data to recommend products to users or
to find patterns in actions taken by users.
Key Points
• Aggregate-oriented databases make inter-aggregate relationships more difficult to handle
than intra-aggregate relationships.
• Graph databases organize data into node and edge graphs; they work best for data that has
complex relationship structures.
• Schemaless databases allow you to freely add fields to records, but there is usually an
implicit schema expected by users of the data.
**********