Aggregate Models in Big Data

Uploaded by

vishal.gahlot14
Copyright
© All Rights Reserved

Aggregate Models In Big Data

Submitted By: Gurmohit Singh

SID: 18205002

Model Key-value
Description A key-value database (also known as a key-value store) is a
type of NoSQL database that stores data using a simple key/value method.

The key-value part refers to the fact that the database stores data as a collection of
key/value pairs. This is a simple method of storing data, and it is known to scale well.

The key-value pair is a well-established concept in many programming languages,
which typically call a collection of key-value pairs an associative array,
dictionary, map, or hash.
Pros Simple data format makes write and read operations by key fast

Values can be anything, including JSON, so schemas are flexible


Cons Optimized only for data with a single key and value. A parser is required to store multiple
values under one key.

Not optimized for lookup by value. Such a lookup requires scanning the whole collection or
maintaining separate index values
Good For
• User profiles
• Session information
• Article/blog comments
• Emails
• Status messages
Supported DBMS
• Redis
• Oracle NoSQL Database
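
The key-value model described above can be sketched with a plain Python dictionary. This is only an illustration, not the API of Redis or any other product: the class `KeyValueStore`, its method names, and the `email` secondary index are all hypothetical. It shows why a read by key is a single step, while a lookup by value needs either a full scan or a separately maintained index.

```python
import json


class KeyValueStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self):
        self._data = {}          # key -> value serialized as a JSON string
        self._email_index = {}   # assumed secondary index: email -> key

    def put(self, key, value):
        # Drop the index entry of any value being overwritten.
        old = self.get(key)
        if isinstance(old, dict) and "email" in old:
            self._email_index.pop(old["email"], None)
        # The store treats the value as opaque; it only serializes it.
        self._data[key] = json.dumps(value)
        # Keep the separate index up to date so email lookups avoid a scan.
        if isinstance(value, dict) and "email" in value:
            self._email_index[value["email"]] = key

    def get(self, key):
        # Lookup by key is a single dictionary access.
        raw = self._data.get(key)
        return None if raw is None else json.loads(raw)

    def find_by_email_scan(self, email):
        # Without an index, every stored value must be parsed and inspected.
        for key, raw in self._data.items():
            value = json.loads(raw)
            if isinstance(value, dict) and value.get("email") == email:
                return key
        return None

    def find_by_email_index(self, email):
        # With the index, the lookup is again a single dictionary access.
        return self._email_index.get(email)


store = KeyValueStore()
store.put("user:1", {"name": "Asha", "email": "asha@example.com"})
store.put("session:42", {"user": "user:1", "ttl": 3600})
profile = store.get("user:1")
```

Both `find_by_email_scan` and `find_by_email_index` return the same key; the difference is that the scan parses every stored value, while the index costs extra work on every write.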

Model Document
Description A document database is a type of nonrelational database that is designed to store and
query data as JSON-like documents.
Document databases make it easier for developers to store and query data in a database by
using the same document-model format they use in their application code. The flexible,
semistructured, and hierarchical nature of documents and document databases allows
them to evolve with applications’ needs.

The document model works well with use cases such as catalogs, user profiles, and
content management systems where each document is unique and evolves over time.
Document databases enable flexible indexing, powerful ad hoc queries, and analytics over
collections of documents.
Pros Nodes can be added on the fly for scalability (MongoDB detects them as they are added)
Rich set of client libraries
Uses BSON (a binary-encoded, JSON-like format that is easy to work with)
Very fast inserts when writes do not have to be acknowledged (use case: logging)
Indexing fields is easy (if queries on a field need to be fast, MongoDB lets you index
that field)
Geospatial indexing and querying
Cons No joins
Less flexible queries than SQL
Complex jobs need Map-Reduce
Unexpected failures can occur (generally not MongoDB's fault: wrong setup, configuration, etc.)
Performance is good only as long as the indexes fit into memory (memory-mapped files)
Running on a single node is dangerous (you may lose your data)
Good For
• Content management
• Catalogs
Supported DBMS
• MongoDB
• Apache Cassandra (more often classified as a wide-column store)
• Amazon DynamoDB
• Couchbase
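
As a rough illustration of the document model described above, the sketch below uses plain Python dictionaries as stand-ins for JSON-like documents. It does not use MongoDB or any real driver; the `catalog` data and the helpers `find` and `build_index` are hypothetical. It shows documents in one collection carrying different fields, an ad hoc query on any field, and a simple index on a single field.

```python
# Each document is a nested, JSON-like structure. Documents in the same
# collection do not have to share the same set of fields.
catalog = [
    {"_id": 1, "type": "book", "title": "NoSQL Basics",
     "authors": ["A. Author"], "price": 30},
    {"_id": 2, "type": "laptop", "title": "UltraBook 13",
     "specs": {"ram_gb": 16, "ssd_gb": 512}, "price": 900},
    {"_id": 3, "type": "book", "title": "Graph Thinking",
     "authors": ["B. Writer", "C. Editor"], "price": 45},
]


def find(collection, **criteria):
    """Return the documents whose top-level fields equal all given criteria."""
    return [doc for doc in collection
            if all(doc.get(field) == wanted
                   for field, wanted in criteria.items())]


def build_index(collection, field):
    """Map each value of `field` to the _ids of the documents that hold it."""
    index = {}
    for doc in collection:
        if field in doc:
            index.setdefault(doc[field], []).append(doc["_id"])
    return index


books = find(catalog, type="book")       # ad hoc query on one field
by_type = build_index(catalog, "type")   # index built on the same field
```

The laptop document has a nested `specs` field that the books lack, and no schema change was needed to store it alongside them.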

Model Column-family


Description A columnar database is a database management system (DBMS) that stores data in
columns instead of rows.

The goal of a columnar database is to write and read data to and from hard
disk storage efficiently in order to reduce the time it takes to return a query result.

In a columnar database, all the column 1 values are physically together, followed by all
the column 2 values, etc. The data is stored in record order, so the 100th entry for
column 1 and the 100th entry for column 2 belong to the same input record. This allows
individual data elements, such as customer name for instance, to be accessed in columns
as a group, rather than individually row-by-row.
Pros Columnar databases have been traditionally developed with horizontal scalability
as a primary design goal. As such, they’re particularly suited to “Big
Data” problems, living on clusters of tens, hundreds, or thousands of nodes.
They also tend to have built-in support for features such as compression and
versioning. The canonical example of a good columnar data storage problem
is indexing web pages. Pages on the Web are highly textual (benefits from
compression), somewhat interrelated, and change over time (benefits from
versioning).

Cons Different columnar databases have different features and therefore different
drawbacks. But one thing they have in common is that it’s best to design
your schema based on how you plan to query the data. This means you should
have some idea in advance of how your data will be used, not just what it’ll
consist of. If data usage patterns can’t be defined in advance (for example,
fast ad hoc reporting), then a columnar database may not be the best fit.

Good For Large organisations that need to make the most


Supported DBMS
• C-Store
• MonetDB
• LucidDB
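
The column-wise layout described above can be sketched in a few lines of Python. This is an in-memory illustration only, not how any listed DBMS stores data on disk; the sample `rows` and the helper `to_columns` are hypothetical, and the sketch assumes every record has the same fields. It shows that all values of one column sit together, and that position i in every column belongs to the same input record.

```python
# Row-oriented view: one dictionary per input record.
rows = [
    {"customer": "Ann", "city": "Pune", "amount": 120},
    {"customer": "Raj", "city": "Delhi", "amount": 80},
    {"customer": "Mei", "city": "Pune", "amount": 200},
]


def to_columns(records):
    """Pivot row records into one list per column, preserving record order."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# An aggregate over one column reads only that column's values.
total_amount = sum(columns["amount"])

# Position 1 in every column belongs to the second input record.
second_record = {name: values[1] for name, values in columns.items()}
```

Computing `total_amount` never touches the `customer` or `city` values, which is the access pattern a columnar layout is meant to favour.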

Model Graph based


Description Graph databases are purpose-built to store and navigate relationships. Relationships are
first-class citizens in graph databases, and most of the value of graph databases is derived
from these relationships. Graph databases use nodes to store data entities, and edges to
store relationships between entities. An edge always has a start node, end node, type, and
direction, and an edge can describe parent-child relationships, actions, ownership, and the
like. There is no limit to the number and kind of relationships a node can have.

A graph in a graph database can be traversed along specific edge types or across the entire
graph. In graph databases, traversing the relationships is very fast because the
relationships between nodes are not calculated at query time but are persisted in the
database.
Pros Graph databases seem to be tailor-made for networking applications. The prototypical
example is a social network, where nodes represent users who have various kinds of
relationships to each other. Modeling this kind of data using any of the other styles is
often a tough fit, but a graph database would accept it with relish.

They also map naturally onto object-oriented systems, in which objects hold references to one another.


Cons Because of the high degree of interconnectedness between nodes, graph databases are
generally hard to partition across the machines of a cluster.

As a result, graph databases don’t scale out well.


Good For
• Fraud detection
• Recommendation engines
Supported DBMS
• Neo4j
• OrientDB
• Dgraph
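
The node-and-edge model described above can be sketched as a small adjacency-list structure in Python. It is not the API of Neo4j or any other graph database; the class `Graph`, its methods, and the sample node and edge names are hypothetical. Each edge is stored with a start node, an end node, a type, and a direction, so a traversal follows stored edges instead of computing joins at query time.

```python
from collections import defaultdict, deque


class Graph:
    """Toy directed graph with typed edges (hypothetical, for illustration)."""

    def __init__(self):
        # start node -> list of (edge type, end node); every edge is directed
        self._out = defaultdict(list)

    def add_edge(self, start, edge_type, end):
        self._out[start].append((edge_type, end))

    def neighbors(self, node, edge_type=None):
        # Follow stored edges directly; no join is computed at query time.
        return [end for kind, end in self._out.get(node, [])
                if edge_type is None or kind == edge_type]

    def reachable(self, start, edge_type=None):
        """Breadth-first traversal along one edge type, or all types if None."""
        seen, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for nxt in self.neighbors(node, edge_type):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        seen.discard(start)
        return seen


g = Graph()
g.add_edge("alice", "FOLLOWS", "bob")
g.add_edge("bob", "FOLLOWS", "carol")
g.add_edge("alice", "OWNS", "car")

follows = g.reachable("alice", "FOLLOWS")   # traverse one edge type only
everything = g.reachable("alice")           # traverse across the whole graph
```

Passing an edge type restricts the traversal to that kind of relationship, while omitting it walks every edge reachable from the start node.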
