Critical Data Warehouse Trends

The document discusses several trends in data warehousing including self-service analytics, real-time data processing, machine learning, data virtualization, metadata management, columnar storage, in-memory processing, and multiplatform/multi-cloud capabilities. It provides examples and definitions for each trend.

Self-Service Analytics:
Self-service analytics is the ability of users to access and use
available resources (e.g. storage, compute, memory) on their own, so
that they can acquire, profile, wrangle, and analyze structured or
unstructured data for an analytical purpose.
It is a common buzzword among organizations that want to be more data
driven and less dependent on IT for their data needs. Many are
implementing self-service capabilities to enable and promote a
data-driven culture.
Real-time data processing is the analysis of
data as it arrives to create insights in real
time. When raw data is received, it is
processed immediately to support near-instant
decision-making. Instead of being stored for
later analysis, it is made available for
insights as quickly as possible, improving
organizations' profitability, efficiency, and
business outcomes.
Some real-world applications of real-time processing
are found in banking systems, data streaming, customer
service structures, and weather radars. Without
real-time processing, these applications would either
be impossible or far less accurate.

For example, weather radar is heavily reliant on the
real-time insights provided by this system of data
processing. Due to the sheer volume of data that is
being collected by supercomputers to study weather
interactions and predictions, real-time processing is
absolutely critical to successful interpretation.
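
The idea of processing each reading the moment it arrives can be sketched in a few lines of Python. This is only an illustration: the function name `rolling_average`, the window size, and the wind-speed values are assumptions made for the example, not part of any real radar system.

```python
from collections import deque
from typing import Iterable, Iterator


def rolling_average(readings: Iterable[float], window: int = 3) -> Iterator[float]:
    """Yield an updated average as soon as each reading arrives."""
    recent = deque(maxlen=window)  # keeps only the newest `window` readings
    for value in readings:
        recent.append(value)
        # The insight is produced immediately, not after a batch is stored.
        yield sum(recent) / len(recent)


# Hypothetical wind-speed readings arriving one at a time.
stream = [10.0, 12.0, 14.0, 40.0]
averages = list(rolling_average(stream, window=3))
print(averages)
```

Because the function is a generator, a consumer can act on each average as it is produced, for example to raise an alert when the value jumps.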
The integration of machine learning into artificial intelligence
has revolutionized how businesses function. It has helped
improve the accuracy and efficiency of business processes by
providing valuable insights through data analysis.

Machine learning is a subset of artificial intelligence that
focuses on developing algorithms and models that enable
computers to learn and improve from experience without
being explicitly programmed. It allows machines to
understand and analyze patterns and make predictions and
decisions based on those patterns. This capability has been a
game-changer for businesses across various industries.
Real-life examples of Machine Learning (ML):

1. Facial recognition
Facial recognition is one of the more obvious
applications of machine learning. Previously, people
received name suggestions when tagging mobile photos or
Facebook posts; now a person can be tagged and verified
immediately by comparing and analyzing the patterns of
facial contours.

2. Product recommendations
Do you wonder how Amazon or other retailers so often
know what you might like to purchase? Or, when they get
it wildly wrong, how they came up with the
recommendation? Thank machine learning. Targeted
marketing in retail uses machine learning to group
customers by buying habits or demographic similarities,
and to extrapolate what one person may want from
someone else's purchases.
3. Email automation and spam filtering
While your inbox seems relatively boring, machine learning
influences how it works behind the scenes. Email automation is
a direct result of successful machine learning, and the
function that goes most unnoticed is spam filtering.
Successful spam filtering adapts by finding patterns in
undesirable email. The inputs include email domains, a
sender's physical location, message text and structure, and
IP addresses. It also relies on users, who mark emails that
were filed incorrectly. Each marked email adds a new data
reference that improves future accuracy.
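
The feedback loop described for spam filtering, where each user-marked email becomes a new data reference, can be sketched as a toy word-count filter. This is a deliberately simplified illustration, not how any real mail provider works: the class `ToySpamFilter`, its methods, and the sample messages are all hypothetical, and it looks only at message text rather than domains, locations, or IP addresses.

```python
from collections import Counter


class ToySpamFilter:
    """Counts words seen in spam and non-spam mail; learns from user marks."""

    def __init__(self) -> None:
        self.spam_words = Counter()
        self.ham_words = Counter()

    def mark(self, text: str, is_spam: bool) -> None:
        # Each marked email adds a new data reference.
        target = self.spam_words if is_spam else self.ham_words
        target.update(text.lower().split())

    def looks_like_spam(self, text: str) -> bool:
        words = text.lower().split()
        spam_score = sum(self.spam_words[w] for w in words)
        ham_score = sum(self.ham_words[w] for w in words)
        return spam_score > ham_score


spam_filter = ToySpamFilter()
spam_filter.mark("win a free prize now", is_spam=True)
spam_filter.mark("meeting agenda for monday", is_spam=False)

before = spam_filter.looks_like_spam("free lunch")

# The user corrects a mistaken filing; the filter adapts.
spam_filter.mark("free lunch on friday", is_spam=False)
after = spam_filter.looks_like_spam("free lunch")
print(before, after)
```

The same message is classified differently before and after the correction, which is the adaptive behavior the text describes.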
Data virtualization is an umbrella term for an approach to data
management that allows an application to retrieve and manipulate data
without needing technical details about it, such as how the data is
formatted or where it is physically located. The goal of data
virtualization is to create a single representation of data from
multiple, disparate sources without having to copy or move the data.

Data virtualization software aggregates structured and unstructured
data sources for virtual viewing through a dashboard or visualization
tool.
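
As a rough sketch of the single-representation idea, the Python below exposes two differently formatted sources through one view. Everything here is assumed for illustration: the inline CSV and JSON strings stand in for real systems, and the function names are hypothetical. Records are read on demand each time the view is queried; nothing is copied into a new store.

```python
import csv
import io
import json
from typing import Iterator

# Stand-ins for two disparate sources with different formats and field names.
CSV_SOURCE = "customer_id,name\n1,Ada\n2,Grace\n"
JSON_SOURCE = '[{"id": 3, "full_name": "Linus"}]'


def read_csv_customers() -> Iterator[dict]:
    for row in csv.DictReader(io.StringIO(CSV_SOURCE)):
        yield {"customer_id": int(row["customer_id"]), "name": row["name"]}


def read_json_customers() -> Iterator[dict]:
    for record in json.loads(JSON_SOURCE):
        yield {"customer_id": record["id"], "name": record["full_name"]}


def virtual_customers() -> Iterator[dict]:
    """One logical view; callers never see the underlying formats."""
    yield from read_csv_customers()
    yield from read_json_customers()


names = [customer["name"] for customer in virtual_customers()]
print(names)
```

The caller works only with `customer_id` and `name`, regardless of where each record physically lives.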
Metadata is simply data about data: a description of the data and its context.
It helps to organize, find, and understand data.

Metadata management refers to the organization and control of data
which describes technical, business, or operational aspects of other
data. It involves a range of processes, policies, and technologies which
describe and give meaning to your data via searchable key attributes
such as order number or customer ID.
Enterprise metadata management helps users find the data they need and
trust that it is accurate. A company likely has a large volume of
complex data coming from many sources, and its users need to be able
to find, understand, and trust the right information to gain actionable
insights that improve the business.
Ultimately, managed metadata makes it easier for all types of users to
find, understand, and access the specific information assets they need.
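
A minimal sketch of a metadata catalog searchable by key attributes might look like the following. The `DatasetMetadata` fields, the dataset names, and the locations are hypothetical, chosen only to mirror the order number and customer ID examples above.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetMetadata:
    name: str
    owner: str       # business metadata: who is responsible for the data
    location: str    # technical metadata: where the data lives
    key_attributes: set = field(default_factory=set)


# Hypothetical catalog entries describing three data sets.
catalog = [
    DatasetMetadata("orders", "sales", "warehouse.sales.orders",
                    {"order_number", "customer_id"}),
    DatasetMetadata("support_tickets", "support", "warehouse.support.tickets",
                    {"ticket_id", "customer_id"}),
    DatasetMetadata("inventory", "operations", "warehouse.ops.inventory",
                    {"sku"}),
]


def find_datasets(attribute: str) -> list:
    """Return the names of data sets that carry the given key attribute."""
    return [entry.name for entry in catalog if attribute in entry.key_attributes]


matches = find_datasets("customer_id")
print(matches)
```

The search runs over the descriptions only; the underlying data sets are never opened.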
Columnar storage (also known as column-oriented or c-store) is a
data storage technique that organizes and stores data by columns. It
is used for data warehousing and big data analytics, where fast
query performance and efficient data compression are essential.

In a columnar database, each column of a table is stored separately,
with all values from that column grouped together. This means that
individual data elements of a particular attribute, such as “Name” or
“Age,” are stored together.

This is in contrast to traditional row-oriented databases, where each
row is stored contiguously, including all attributes of that row.
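
The difference between the two layouts can be illustrated with plain Python structures. This is a conceptual sketch with made-up sample records, not a model of how any particular database lays out bytes on disk.

```python
# Row-oriented layout: each record keeps all of its attributes together.
rows = [
    {"name": "Ada", "age": 36, "city": "London"},
    {"name": "Grace", "age": 45, "city": "New York"},
    {"name": "Linus", "age": 28, "city": "Helsinki"},
]

# Column-oriented layout: all values of one attribute are grouped together.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# An aggregate over one attribute touches every full record in the row layout...
average_age_rows = sum(row["age"] for row in rows) / len(rows)

# ...but only a single list in the column layout.
average_age_cols = sum(columns["age"]) / len(columns["age"])

print(columns["age"], average_age_cols)
```

Analytical queries usually read a few attributes across many records, which is why grouping values by column tends to suit them.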
In-Memory Processing
In-memory processing is the practice of taking action on data
entirely in computer memory (e.g., in RAM). This is in contrast to
other techniques of processing data which rely on reading and
writing data to and from slower media such as disk drives.
In-memory processing typically implies large-scale environments
where multiple computers are pooled together so their collective
RAM can be used as a large and fast storage medium. Since the
storage appears as one big, single allocation of RAM, large data sets
can be processed all at once, rather than being limited to data sets
that fit into the RAM of a single computer.
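
The pooling idea can be simulated on one machine. In this sketch each inner list stands in for the RAM of a separate computer; the `partition` helper and the node count are assumptions for illustration, and a real deployment would rely on a distributed framework rather than local lists.

```python
def partition(data: list, node_count: int) -> list:
    """Spread the data set across `node_count` simulated nodes."""
    return [data[i::node_count] for i in range(node_count)]


dataset = list(range(1, 101))

# Each inner list represents the share of the data held in one node's memory.
node_memory = partition(dataset, node_count=4)

# Every node works on its own share without touching disk...
partial_sums = [sum(chunk) for chunk in node_memory]

# ...and the partial results are combined into one answer.
total = sum(partial_sums)
print(total)
```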
Multiplatform typically means capable of running on two or more
different hardware platforms. For example, versions of software
available for the Windows and Mac desktop environments are
multiplatform, as is software that is available for iOS and Android
mobile devices. Interpreted software is very often multiplatform:
the program's source code stays the same, while interpreter
runtime engines are available for two or more hardware
platforms.
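
Python is one such interpreted language, so a short script can illustrate the point. The same source file runs unchanged wherever a Python runtime is installed; only the reported values differ. The function name `describe_runtime` and the file path are hypothetical.

```python
import os
import platform


def describe_runtime() -> str:
    """Report which interpreter and platform are running this same source."""
    return (
        f"{platform.python_implementation()} {platform.python_version()} "
        f"on {platform.system()} ({platform.machine()})"
    )


# os.path.join uses the separator of whatever platform the script runs on.
data_path = os.path.join("data", "report.csv")

print(describe_runtime())
print(data_path)
```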

Multi-cloud is the utilization of two or more public cloud
providers to serve an organization’s IT services and
infrastructure. There is no single multi-cloud vendor.

Usually the reason for the multi-cloud model is that a single
vendor is not able to perfectly meet all needs of an enterprise.
With several cloud providers, a company can also avoid data
THANK
YOU
