Data Virtualization Patterns
Agenda
1. Introduction
2. Data Virtualization Patterns
3. Logical Data Warehouse
4. Big Data
5. Single View Applications
6. Enterprise Data Layer/Data Services
Introduction
Denodo ‘Horizontal Solution’ Categories
Customer Centricity / MDM
✓ Complete View of Customer
✓ Customer Service Unified Desktop
✓ Unified Desktop for Contact Center
Data Governance
✓ GRC
✓ GDPR
✓ Data Privacy / Masking
Data Services
✓ Data as a Service
✓ Data Marketplace
✓ Data Services
✓ Application and Data Migration
BI and Analytics
✓ Self-Service Analytics
✓ Self-Service Discovery
✓ Self-Service Exploration
✓ Self-Service Collaboration
✓ Logical Data Warehouse
✓ Inventory-Sales Reconciliation Reports
✓ Agile Reporting using Logical Data Warehouse
✓ Enterprise Data Fabric
✓ Single View of Supply Chain
✓ Secure Data Services Layer
Big Data
✓ Logical Data Lake
✓ Single View for Customer Analytics
✓ Data Warehouse Offloading
✓ Cost Reduction
✓ IoT Analytics
✓ Contextual Data for Advanced Analytics
Cloud Solutions
✓ Cloud Modernization
✓ Cloud Analytics
✓ Hybrid Data Fabric
Logical Data Warehouse
Gartner definition
Description:
• “The Logical Data Warehouse (LDW) is a new data management architecture for analytics
combining the strengths of traditional repository warehouses with alternative data
management and access strategy. The LDW will form a new best practice by the end of
2015.”
• “The LDW is an evolution and augmentation of DW practices, not a replacement”
• “A repository-only style DW contains a single ontology/taxonomy, whereas in the LDW a
semantic layer can contain many combination of use cases, many business definitions of the
same information”
• “The LDW permits an IT organization to make a large number of datasets available for
analysis via query tools and applications”
Logical Data Warehouse
Description:
• A semantic layer on top of the data warehouse holds the business data definitions.
• The Data Virtualization layer allows distributed analytic processing; the data warehouse is
not the only analytical processing node.
• Allows the integration of multiple data sources including enterprise systems, the data
warehouse, additional processing nodes (analytical appliances, Big Data, …), Web, Cloud
and unstructured data.
• Multiple sub-patterns (in Appendix): Federation of Data Warehouses, Data Warehouse
Extension, Operational BI, etc.
• Publishes data to multiple applications and reporting tools.
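The idea of a semantic layer over several sources can be sketched in a few lines. This is only an illustration, not how a Data Virtualization product is implemented: two attached in-memory SQLite databases stand in for an ERP system and a sales warehouse, and all table, column and view names are assumptions made for the example.

```python
import sqlite3

# Hypothetical stand-ins for two physical sources: an ERP system and a
# sales data warehouse. Each is a separate attached in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS erp")
conn.execute("ATTACH DATABASE ':memory:' AS dw")

conn.execute("CREATE TABLE erp.customers (customer_id INTEGER, name TEXT, region TEXT)")
conn.execute("CREATE TABLE dw.sales (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO erp.customers VALUES (?, ?, ?)",
                 [(1, "Acme", "EMEA"), (2, "Globex", "APAC")])
conn.executemany("INSERT INTO dw.sales VALUES (?, ?)",
                 [(1, 100.0), (1, 50.0), (2, 75.0)])

# The "semantic layer": a view that joins and aggregates across both
# sources. A TEMP view is used because SQLite only lets temporary views
# reference tables in other attached databases.
conn.execute("""
    CREATE TEMP VIEW sales_by_customer_and_region AS
    SELECT c.name AS customer, c.region AS region, SUM(s.amount) AS total_sales
    FROM erp.customers AS c
    JOIN dw.sales AS s ON s.customer_id = c.customer_id
    GROUP BY c.name, c.region
""")

# Consumers query the view and never see where the data physically lives.
rows = conn.execute(
    "SELECT customer, region, total_sales "
    "FROM sales_by_customer_and_region ORDER BY customer"
).fetchall()
print(rows)
```

Reporting tools would query only `sales_by_customer_and_region`; the underlying sources can change without affecting them.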
Logical Data Warehouse
Figure: a ‘Sales by Customer and Region’ view built with aggregation (∑), join and union (U) operators over ERP, Sales, HDFS Files and Document Collections sources.
Logical Data Warehouse Example
Virtual Data Marts
Description:
• Data marts can be built in the virtual layer by combining and transforming imported views
from the data warehouse (or from the staging area in a Bus Architecture).
• No additional storage node is needed; the data stays in the data warehouse.
• The Data Virtualization cache can be used to off-load processing from the data warehouse and
maintain acceptable performance.
• The cache can be populated with the most-used data sets on a per-view basis; there is no need
to cache all data.
Virtual Data Marts
Figure: a ‘Sales by Product & Retailer in TTM’ view aggregates (∑) the sales fact table with the Retailer, Product and Time dimensions; the sources shown are a PIM database and the EDW.
Virtual Data Marts Example
Virtual Sandboxes & Prototyping
Description:
• A virtual sandbox can be created to allow controlled access to corporate data sources and a
safe environment for data analysts to experiment with and integrate their personal data files
(e.g. log files, Hadoop, Excel, etc.).
• Avoids unfettered access to corporate data sources or copying ‘raw’ data sources into
the corporate data environment.
• Data analysts don’t need to request data extracts (e.g. Excel dumps), keeping their data fresh
and controlled.
• The virtual sandbox can be a separate Data Virtualization instance or a separate Virtual
Database (VDB) within a shared instance.
Virtual Sandboxes & Prototyping
Figure: an experimentation ‘sandbox’ joins corporate sources (ERP, Sales, Database, EDW) with analysts’ personal files (web log files, text files, Excel files).
Big Data Lakes
Logical Data Lake
Description:
• Hadoop is used to store huge amounts of ‘natural’ data before it is integrated
with ‘curated’ data
• The volume and complexity of ‘big data’ need to be managed cost-effectively
• Retain raw data for future analysis
• Use existing data analyst skill sets (SQL, Excel) to access and analyse data in Hadoop
• Example: Social media data (e.g. tweets, Facebook ‘likes’, etc.) or web clickstreams are
captured and stored in Hadoop and integrated with customer data stored in DW or CRM
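The example above can be sketched as a small integration step. The records, field names and the helper function are all hypothetical; in practice the feedback would be read from Hadoop through a SQL engine and the customer and sales data from the CRM and DW.

```python
from collections import defaultdict

# Hypothetical raw records as they might be read from a Hadoop store.
raw_feedback = [
    {"customer_id": 1, "sentiment": "positive"},
    {"customer_id": 2, "sentiment": "negative"},
    {"customer_id": 1, "sentiment": "positive"},
]
# Hypothetical curated records from the CRM and the data warehouse.
crm_customers = {1: "Acme", 2: "Globex"}
dw_sales = [(1, 100.0), (2, 40.0), (1, 25.0)]


def sales_for_customers_with_positive_feedback(feedback, customers, sales):
    """Total sales per customer name, restricted to customers with positive feedback."""
    positive_ids = {f["customer_id"] for f in feedback if f["sentiment"] == "positive"}
    totals = defaultdict(float)
    for customer_id, amount in sales:
        if customer_id in positive_ids:
            totals[customers[customer_id]] += amount
    return dict(totals)


result = sales_for_customers_with_positive_feedback(raw_feedback, crm_customers, dw_sales)
print(result)
```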
Logical Data Lake
Figure: a ‘Sales by Customer with Positive Customer Feedback’ view aggregates (∑) data from the CRM, the EDW and a Hadoop cluster.
Logical Data Lake Example
Figure labels: discovery zone, presentation layer, data lake, systems of record.
Analytical Data Integration
Description:
• Analytics over the Big Data repository is carried out using a Hadoop SQL engine
(Spark, Cloudera Impala, Apache Drill, IBM Big SQL, etc.).
• Data Virtualization helps to integrate the analytical results with other information
in the enterprise (e.g. customer information from the CRM system, or even results
from another analytical engine).
• This other information gives the analytical results ‘context’, making them more
meaningful to the business.
Analytical Data Integration
Figure: an ‘Integrated Insights’ view joins projections (π) of a PIM database, the Sales EDW and a Hadoop cluster.
Analytical Data Integration Example
Figure labels: Dealer, Maintenance, Parts Inventory.
Data Warehouse Offloading
Description:
• Cold (infrequently used) data is copied to a Hadoop store, which is a cheaper solution.
• Data Virtualization allows reports that combine current data from the DW with
cold data from the Hadoop store.
• Example (Sales Data Warehouse): sales data older than 2 years is moved to Hadoop, while more
recent data is kept in the data warehouse.
• To consuming applications, the hybrid data warehouse appears as a single data
store.
• The Data Virtualization engine applies query partitioning techniques to prune
unnecessary query branches, which improves performance.
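Branch pruning over a partitioned union can be sketched as follows. This is a simplified illustration and not the engine's actual optimizer: the cutoff date, the source names and the `branches_to_query` helper are assumptions. Each branch of the union declares the date range it covers, and a query only touches the branches whose range overlaps its own date filter.

```python
from datetime import date

# Each branch of the union view covers a date range in one physical store.
# The store names and the cutoff date are assumptions for this sketch.
CUTOFF = date(2022, 1, 1)
BRANCHES = [
    {"source": "hadoop", "start": date.min, "end": CUTOFF},     # cold data
    {"source": "warehouse", "start": CUTOFF, "end": date.max},  # recent data
]


def branches_to_query(query_start, query_end):
    """Return the sources whose half-open range overlaps [query_start, query_end)."""
    return [
        b["source"]
        for b in BRANCHES
        if b["start"] < query_end and query_start < b["end"]
    ]


# A report on recent sales never touches the Hadoop branch.
print(branches_to_query(date(2023, 3, 1), date(2023, 4, 1)))
# A report spanning the cutoff needs both branches.
print(branches_to_query(date(2021, 6, 1), date(2022, 6, 1)))
```

Because the pruning happens in the virtual layer, the consuming application still issues one query against what looks like a single sales table.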
Data Warehouse Offloading
Figure: BI / reporting and analytics tools query a star schema with Dimension 1 (product) and Dimension 2 (country); the sales fact table is shown both as a single table and split into ‘sales < 2 years old’ and ‘sales > 2 years old’.
Single View Applications
Single View of Customer
Description:
• The Data Virtualization layer offers unified views of business entities.
• Usually the information about critical business entities is spread across many
information silos.
• e.g. “Customer” information from the CRM system, the billing system, the provisioning
system, etc.
• Mix of operational systems and informational systems
• The ‘single view’ is usually built on the fly, although caching can be enabled to increase
performance.
• Example: a single view of the customer lets call center agents handle customer
enquiries better.
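A minimal sketch of building the single view on the fly, with an optional cached variant. The three dictionaries stand in for the CRM, billing and provisioning systems, and every name and field here is hypothetical.

```python
from functools import lru_cache

# Hypothetical silos; in practice these would be calls to the CRM,
# billing and provisioning systems.
CRM = {42: {"name": "Jane Doe", "address": "1 Main St"}}
BILLING = {42: {"balance": 19.99}}
PROVISIONING = {42: {"products": ["broadband", "mobile"]}}


def build_customer_view(customer_id):
    """Assemble the unified record on the fly from every silo."""
    view = {"customer_id": customer_id}
    for silo in (CRM, BILLING, PROVISIONING):
        view.update(silo.get(customer_id, {}))
    return view


@lru_cache(maxsize=1024)
def cached_customer_view(customer_id):
    """Optional cached variant: faster, but may return stale data."""
    return build_customer_view(customer_id)


print(build_customer_view(42))
```

The uncached function always reflects the current state of the silos; the cached one trades freshness for speed, which is the same trade-off as enabling the cache in the virtual layer.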
Single View of Customer
Figure: enterprise apps consume a joined view of Customer Details, Address, Products, Billing, Logs and Incidents.
Single View of Customer Example
Enterprise Data Layer/Data Services
Enterprise Data Layer
Description:
• Provide a data access layer so that different users across the enterprise can access
data in a secure and managed fashion and share a common data ‘model’
• Provide secure and managed access to data across the enterprise
• Provide consistency of data
• Hide complexity, format, and location of actual data sources
• Support many consumption protocols and patterns
• Example: Single data access layer for all development teams to avoid ‘hunting down
and interpreting data differently by project’
Enterprise Data Layer
Figure: enterprise apps access data through a layer that combines sources with join and union (U) operators.
Enterprise Data Layer Example
Enterprise Data Marketplace
Description:
• A variant of an Enterprise Data Access Layer in which users can ‘subscribe’ to data via
the marketplace
• Provide secure and managed access to data across the enterprise
• Hide complexity, format, and location of actual data sources
• Provide a ‘search and subscribe’ mechanism that lets users find and access data
services
• Example: An organization provisions access to data services via a managed and
secure marketplace. The services are provided by a Data Virtualization layer.
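The ‘search and subscribe’ mechanism can be illustrated with a toy catalog. The `DataMarketplace` class, its methods and the service names are hypothetical and exist only for this sketch; a real marketplace would add authentication, approval workflows and links to the published data services.

```python
class DataMarketplace:
    """Toy catalog of data services with search and subscribe."""

    def __init__(self):
        self._services = {}     # service name -> description
        self._subscribers = {}  # service name -> set of user names

    def publish(self, name, description):
        """Register a data service in the catalog."""
        self._services[name] = description
        self._subscribers.setdefault(name, set())

    def search(self, keyword):
        """Return names of services whose name or description mentions keyword."""
        kw = keyword.lower()
        return sorted(
            name for name, desc in self._services.items()
            if kw in name.lower() or kw in desc.lower()
        )

    def subscribe(self, user, name):
        """Grant a user access to a published service."""
        if name not in self._services:
            raise KeyError(f"unknown data service: {name}")
        self._subscribers[name].add(user)

    def can_access(self, user, name):
        """Access is managed: only subscribers may use a service."""
        return user in self._subscribers.get(name, set())


market = DataMarketplace()
market.publish("customer_360", "Unified customer view from CRM and billing")
market.publish("sales_history", "Sales facts from the data warehouse")
print(market.search("customer"))
market.subscribe("alice", "customer_360")
```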
Thanks!
www.denodo.com [email protected]
© Copyright Denodo Technologies. All rights reserved
Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm,
without the prior written authorization of Denodo Technologies.