MODULE-1 Data at Rest Vs Data in Motion

Uploaded by

Bhavya Bajaj

Data at rest vs data in motion

NOSQL
FOR M.SC. (DATA SCIENCE)
BY DR. MANISHA M. PATIL
Topics
Big Data Overview
What is NoSQL
Types of Data
Types of NoSQL
Advantages of NoSQL
Why NoSQL
Data at rest vs data in motion
What is NoSQL
Under the broad umbrella of Big Data sit a number of confusing terms and acronyms. One of them, the NoSQL database, is attracting a great deal of interest.

NoSQL, or "Not Only SQL", is an approach to data management and database design that is useful for very large sets of distributed data.

It covers a range of technologies and architectures that target the big data performance and scalability problems that relational databases cannot address. NoSQL databases are used when companies and enterprises need to access and analyze large amounts of unstructured data, or data stored across multiple virtual servers in the cloud.


What is NoSQL

There is no specific definition of NoSQL, but a set of common observations describes it:
•Not using the relational model
•Running well on clusters
•Mostly open-source
•Built for 21st-century web estates
•Schema-less
The Types of NoSQL
Key Value Databases:
From an API perspective, key-value databases are simple data stores. A client can put a value for a key, get the value for a key, or delete a key from the data store. Because all access goes through the primary key, these stores generally scale easily and perform well.
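
The three operations described above can be sketched as a toy in-memory store. This is only an illustration, assuming a plain Python dictionary as the backing storage; the class name `KeyValueStore` and the key `user:42` are made up for the example and do not correspond to any real product's API.

```python
from typing import Any, Dict, Optional


class KeyValueStore:
    """Toy in-memory key-value store (illustrative, not a real database)."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        # Insert a new value or overwrite the existing one for this key.
        self._data[key] = value

    def get(self, key: str) -> Optional[Any]:
        # Look up by key only; return None when the key is absent.
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # Remove the key and report whether it was present.
        if key in self._data:
            del self._data[key]
            return True
        return False


store = KeyValueStore()
store.put("user:42", {"name": "Asha"})
profile = store.get("user:42")
```

Every operation addresses exactly one key, which is the property that makes it straightforward to spread keys across many machines.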

Document Databases:
Documents are the central concept in document databases. The database stores and retrieves documents, which can be in XML, JSON, BSON, and similar formats. The documents are usually similar to each other and are self-describing, hierarchical tree data structures that consist of scalar values, maps, and collections.
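
As a rough illustration of such a document, the sketch below builds one hypothetical order record containing a scalar value, a map, and a collection, then round-trips it through JSON with Python's standard `json` module. The field names and values are invented for the example.

```python
import json

# One self-describing document: the field names travel with the data.
order = {
    "order_id": 1001,                                 # scalar value
    "customer": {"name": "Asha", "city": "Pune"},     # map (nested object)
    "items": [                                        # collection (array)
        {"sku": "A1", "qty": 2},
        {"sku": "B7", "qty": 1},
    ],
}

# Serialize to JSON text, as it might be sent to a document store,
# and parse it back into a Python structure.
text = json.dumps(order)
restored = json.loads(text)

# Navigate the hierarchy to compute the total quantity ordered.
total_qty = sum(item["qty"] for item in restored["items"])
```

Because the structure is carried inside each document, a second order could add or omit fields without any schema change.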

Column family stores:
These databases store data in column families: rows that have many columns associated with a row key. A column family is a group of related data that is often accessed together.
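
One way to picture this layout is as nested maps: row key, then column family, then column name. The sketch below is only a mental model built on Python dictionaries, with invented row keys and family names; real column family stores add persistence, distribution, and versioning that are not shown here.

```python
from typing import Dict

# row key -> column family -> column name -> value
Row = Dict[str, Dict[str, str]]

table: Dict[str, Row] = {
    "user:42": {
        "profile": {"name": "Asha", "city": "Pune"},
        "activity": {"last_login": "2024-01-15", "visits": "17"},
    },
}


def read_family(data: Dict[str, Row], row_key: str, family: str) -> Dict[str, str]:
    """Return a copy of all columns of one family for one row.

    Returns an empty dict when the row or the family does not exist.
    """
    return dict(data.get(row_key, {}).get(family, {}))


profile = read_family(table, "user:42", "profile")
```

Reading the `profile` family fetches the related columns together without touching the `activity` family.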

Graph databases:
Graph databases store nodes (entities) and the relationships between those nodes.
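
A minimal sketch of the node-and-relationship idea follows, assuming a simple adjacency list held in memory. The `Graph` class, the node identifiers, and the relationship names such as `FRIEND_OF` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class Graph:
    """Toy property graph: nodes with properties, plus named relationships."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Dict[str, Any]] = {}
        # source node -> list of (relationship kind, target node)
        self.edges: Dict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_node(self, node_id: str, **properties: Any) -> None:
        self.nodes[node_id] = properties

    def add_relationship(self, source: str, kind: str, target: str) -> None:
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both nodes must exist before they are related")
        self.edges[source].append((kind, target))

    def related(self, source: str, kind: str) -> List[str]:
        # Follow only relationships of the requested kind.
        return [t for k, t in self.edges.get(source, []) if k == kind]


graph = Graph()
graph.add_node("asha", label="Person")
graph.add_node("ravi", label="Person")
graph.add_node("nosql", label="Topic")
graph.add_relationship("asha", "FRIEND_OF", "ravi")
graph.add_relationship("asha", "LIKES", "nosql")
friends = graph.related("asha", "FRIEND_OF")
```

Relationships are stored as first-class data, so a query such as "who are Asha's friends" is a direct traversal.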
The advantages of NoSQL:

•High scalability
•Distributed Computing
•Lower cost
•Schema flexibility
•Un/semi-structured data
•No complex relationships
Database Ranking
Need of NoSQL Skills
A survey of job postings conducted by Dice showed 1,133 Hadoop-related jobs and 927 NoSQL jobs on Dice's board.

These figures reflect the importance companies place on database management skills, which is directly attributed to the increasing popularity and growth of big data.
This underlines the value of NoSQL skills for anyone who wants to stay relevant as NoSQL adoption grows.
Data at rest vs data in motion

• Gaining insights from big data is no small task. Having the right technology in place to collect, manage and analyze data for predictive purposes or real-time insight is critical.

• Different types of data may require different computing platforms to provide meaningful insights.

• Understanding the difference between data in motion and data at rest can help determine the type of technology and processing capabilities required to glean insights from the data.
Data at rest
This refers to data that has been collected from various sources and is then analyzed after the
event occurs.
The point where the data is analyzed and the point where action is taken on it occur at two separate times.
For example, a retailer analyzes a previous month’s sales data and uses it to make strategic
decisions about the present month’s business activities.
The action takes place after the data-creating event has occurred.
This data is meaningful to the retailer and allows them to create marketing campaigns and send
customized coupons based on customer purchasing behavior and other variables.
While the data provides value, the business impact is dependent on the customer coming back to
the store to take advantage of the offers.
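
The retailer example can be sketched as a small batch job: the whole month of sales is already collected, and the analysis runs over the complete data set afterwards. The sales records, the customer names, and the coupon threshold below are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Data at rest: last month's sales, fully collected before analysis starts.
last_month_sales = [
    ("alice", 120.0),
    ("bob", 35.5),
    ("alice", 80.0),
    ("carol", 210.0),
]


def total_by_customer(sales: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Aggregate spend per customer over the complete data set."""
    totals: Dict[str, float] = defaultdict(float)
    for customer, amount in sales:
        totals[customer] += amount
    return dict(totals)


def customers_for_coupon(totals: Dict[str, float], threshold: float) -> List[str]:
    """Pick customers whose monthly spend reached the threshold."""
    return sorted(c for c, spent in totals.items() if spent >= threshold)


totals = total_by_customer(last_month_sales)
coupon_list = customers_for_coupon(totals, threshold=150.0)
```

The coupon decision is made well after the purchases happened, which is the defining trait of analysis on data at rest.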
Data in motion
The collection process for data in motion is similar to that of data at rest;
however, the difference lies in the analytics.
In this case, the analytics occur in real-time as the event happens.
An example here would be a theme park that uses wristbands to collect data about its guests.
These wristbands would constantly record data about the guest’s activities, and the park could
use this information to personalize the guest visit with special surprises or suggested activities
based on their behavior.
This allows the business to customize the guest experience during the visit.
Organizations have a tremendous opportunity to improve business results in these scenarios.
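
By contrast, the wristband example can be sketched as a handler that reacts to each event as it arrives rather than waiting for a complete data set. The sketch assumes events are simple (guest, attraction) pairs and uses a made-up rule, a suggestion on a guest's second visit to the same attraction; a production system would read from a live stream rather than a list.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def suggest_in_real_time(
    events: Iterable[Tuple[str, str]], repeat_threshold: int = 2
) -> Iterator[Tuple[str, str]]:
    """Yield a suggestion as soon as an incoming event triggers the rule."""
    visits: Counter = Counter()
    for guest, attraction in events:
        visits[(guest, attraction)] += 1
        # Act immediately, while the guest is still in the park.
        if visits[(guest, attraction)] == repeat_threshold:
            yield guest, f"fast pass for {attraction}"


# Stand-in for a live stream of wristband readings.
wristband_events = [
    ("g1", "coaster"),
    ("g2", "carousel"),
    ("g1", "coaster"),
    ("g1", "coaster"),
]
suggestions = list(suggest_in_real_time(wristband_events))
```

Because the function is a generator, each suggestion is produced at the moment its triggering event is processed, not at the end of the day.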
Infrastructure for data processing
You might be wondering what type of IT infrastructure would be needed to support data processing for both of these types.
The answer depends on the processing method you choose and on your business objectives for the data.

For data at rest, a batch processing method would be most likely. In this case, you could spin up a bare-metal server during
the time you need to analyze the data and shut it back down when you are done. With no need for “always on” infrastructure,
this approach provides access to high-performance processing capabilities as needed.

For data in motion, you’d want to utilize a real-time processing method. In this case, latency becomes a key consideration
because a lag in processing could result in a missed opportunity to improve business results. By eliminating the resource
constraints of multi-tenancy, the bare-metal cloud offers reduced latency and high-performance levels, making it a good
choice for processing large volumes of high-velocity data in real time.

Both types of data have their advantages and can provide meaningful insights for your business. Determining the right
processing method and infrastructure depends on the requirements for your specific use case and data strategy.
