Data Virtualization Patterns

Uploaded by

Hamza Nasri
Copyright
© All Rights Reserved

Data Virtualization Patterns
Agenda
1. Introduction
2. Data Virtualization Patterns
3. Logical Data Warehouse
4. Big Data
5. Single View Applications
6. Enterprise Data Layer/Data Services
Introduction

Data Virtualization is an Essential Data Integration Tool.


It is a key utility in your toolkit, serving tactical needs on many projects.
DV usage usually starts as a tactical project and evolves, project by project, until it
becomes a strategic Common Data Layer in the enterprise.
There are several Data Virtualization usage patterns, which reflect industry best practices.

Data Virtualization Patterns
Denodo ‘Horizontal Solution’ Categories

Customer Centricity / MDM
✓ Complete View of Customer

Data Governance
✓ GRC
✓ GDPR
✓ Data Privacy / Masking

Data Services
✓ Data as a Service
✓ Data Marketplace
✓ Data Services
✓ Application and Data Migration

BI and Analytics
✓ Self-Service Analytics
✓ Logical Data Warehouse
✓ Enterprise Data Fabric

Big Data
✓ Logical Data Lake
✓ Data Warehouse Offloading
✓ IoT Analytics

Cloud Solutions
✓ Cloud Modernization
✓ Cloud Analytics
✓ Hybrid Data Fabric

Denodo ‘Horizontal Solution’ Categories

Customer Centricity / MDM
✓ Complete View of Customer
    ✓ Customer Service Unified Desktop
    ✓ Unified Desktop for Contact Center
    ✓ Customer Self-Service Portal
    ✓ Single Customer View for Back Office Automation

Denodo ‘Horizontal Solution’ Categories

Data Governance
✓ GRC
    ✓ Data Retention for Regulatory Compliance
    ✓ Risk Reporting for Basel III Compliance
    ✓ Single View of Risk
✓ GDPR
    ✓ Data Privacy and Protection
✓ Data Privacy / Masking
    ✓ Data Privacy in a Hybrid Environment
    ✓ De-identifying Patient Data according to HIPAA Safe Harbor Rules

Denodo ‘Horizontal Solution’ Categories

Data Services & Catalog
✓ Data as a Service
    ✓ Data Services for Drug Discovery
    ✓ Unified Data Services Layer
    ✓ Enterprise Data Service Layer
✓ Data Marketplace
    ✓ Data Access Marketplace
    ✓ Liquidity Management Dashboard
✓ Data Services
    ✓ Cable Set Top Box Transaction Management
    ✓ RESTful Web Services API for Development Teams
✓ Application and Data Migration
    ✓ Migration Abstraction Layer
    ✓ Mergers and Acquisitions

Denodo ‘Horizontal Solution’ Categories

BI and Analytics
✓ Self-Service Analytics
    ✓ Self-Service Discovery
    ✓ Self-Service Exploration
    ✓ Self-Service Collaboration
✓ Logical Data Warehouse
    ✓ Inventory-Sales Reconciliation Reports
    ✓ Logical Data Warehouse
    ✓ Agile Reporting using Logical Data Warehouse
✓ Enterprise Data Fabric
    ✓ Single View of Supply Chain
    ✓ Secure Data Services Layer

Denodo ‘Horizontal Solution’ Categories

Big Data
✓ Logical Data Lake
    ✓ Single View for Customer Analytics
✓ Data Warehouse Offloading
    ✓ Cost Reduction
✓ IoT Analytics
    ✓ Contextual Data for Advanced Analytics

Denodo ‘Horizontal Solution’ Categories

Cloud Solutions
✓ Cloud Modernization
    ✓ Application Modernization
    ✓ Cloud Migration
✓ Cloud Analytics
    ✓ Analytics in the Cloud
    ✓ Web/Cloud/Semi-Structured Data Integration
✓ Hybrid Data Fabric
    ✓ Single View of Customer for Distributor Portal
    ✓ Automation of Data Service for digital programs

Logical Data Warehouse
Gartner definition

Description:
• “The Logical Data Warehouse (LDW) is a new data management architecture for analytics
combining the strengths of traditional repository warehouses with alternative data
management and access strategy. The LDW will form a new best practice by the end of
2015.”
• “The LDW is an evolution and augmentation of DW practices, not a replacement”
• “A repository-only style DW contains a single ontology/taxonomy, whereas in the LDW a
semantic layer can contain many combination of use cases, many business definitions of the
same information”
• “The LDW permits an IT organization to make a large number of datasets available for
analysis via query tools and applications”

Logical Data Warehouse

Description:
• A semantic layer on top of the data warehouse that holds the business data definitions.
• The Data Virtualization layer allows distributed analytic processing: the data warehouse is
no longer the only analytical processing node.
• Allows the integration of multiple data sources including enterprise systems, the data
warehouse, additional processing nodes (analytical appliances, Big Data, …), Web, Cloud
and unstructured data.
• Multiple sub-patterns (in Appendix): Federation of Data Warehouses, Data Warehouse
Extension, Operational BI, etc.
• Publishes data to multiple applications and reporting tools.
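
The federation idea described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Denodo's implementation: one SQLite connection stands in for the virtualization layer, three attached in-memory databases stand in for separate physical sources, and every table name, column name, and value is hypothetical.

```python
import sqlite3

# One connection stands in for the virtualization layer; each attached
# in-memory database stands in for a separate physical source.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS erp")
conn.execute("ATTACH DATABASE ':memory:' AS edw")
conn.execute("ATTACH DATABASE ':memory:' AS hadoop")

# Hypothetical source tables and sample rows.
conn.execute("CREATE TABLE erp.customers (customer_id INTEGER, name TEXT, region TEXT)")
conn.execute("CREATE TABLE edw.sales (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE hadoop.sales_archive (customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO erp.customers VALUES (?, ?, ?)",
                 [(1, "Acme", "EMEA"), (2, "Globex", "APAC")])
conn.executemany("INSERT INTO edw.sales VALUES (?, ?)", [(1, 100.0), (2, 40.0)])
conn.executemany("INSERT INTO hadoop.sales_archive VALUES (?, ?)", [(1, 25.0)])

# The "logical" view: a union of two sales sources joined to ERP customers.
SALES_BY_CUSTOMER_AND_REGION = """
    SELECT c.name, c.region, SUM(s.amount) AS total
    FROM (
        SELECT customer_id, amount FROM edw.sales
        UNION ALL
        SELECT customer_id, amount FROM hadoop.sales_archive
    ) AS s
    JOIN erp.customers AS c ON c.customer_id = s.customer_id
    GROUP BY c.name, c.region
    ORDER BY c.name
"""

rows = conn.execute(SALES_BY_CUSTOMER_AND_REGION).fetchall()
print(rows)
```

The consumer sees a single result set and never learns which source held which rows, which is the point of the pattern.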

Logical Data Warehouse

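[Figure: a virtual view, "Sales by Customer and Region", built with a join (∞) and a union (U)
over five sources: ERP database, sales EDW, HDFS files in a Hadoop cluster, document
collections in a NoSQL database, and Excel.]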

Logical Data Warehouse Example

Virtual Data Marts

Description:
• Data marts can be built in the virtual layer by combining and transforming imported views
from the data warehouse (or from the staging area in a Bus Architecture).
• No additional storage node is needed: the data stays in the data warehouse.
• The Data Virtualization cache can be used to off-load processing from the data warehouse and
maintain acceptable performance.
• The cache can be populated with the most used data sets on a per-view basis, so there is no
need to cache all data.
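
A minimal sketch of a virtual data mart with per-view caching follows. It assumes an in-memory SQLite database in place of the data warehouse; the star-schema tables, the view name, and the helpers `refresh_cache` and `read_view` are hypothetical and only illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stands in for the data warehouse
conn.executescript("""
    CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, product TEXT);
    CREATE TABLE dim_retailer (retailer_id INTEGER PRIMARY KEY, retailer TEXT);
    CREATE TABLE fact_sales   (product_id INTEGER, retailer_id INTEGER, amount REAL);

    INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_retailer VALUES (10, 'NorthMart'), (20, 'SouthMart');
    INSERT INTO fact_sales   VALUES (1, 10, 5.0), (1, 10, 7.0), (2, 20, 3.0);

    -- The virtual data mart: a view definition only, no extra storage.
    CREATE VIEW mart_sales_by_product_retailer AS
        SELECT p.product, r.retailer, SUM(f.amount) AS total
        FROM fact_sales f
        JOIN dim_product  p ON p.product_id  = f.product_id
        JOIN dim_retailer r ON r.retailer_id = f.retailer_id
        GROUP BY p.product, r.retailer;
""")

def refresh_cache(conn, view):
    """Materialize one view into a cache table (per-view caching)."""
    conn.execute(f"DROP TABLE IF EXISTS cache_{view}")
    conn.execute(f"CREATE TABLE cache_{view} AS SELECT * FROM {view}")

def read_view(conn, view):
    """Serve from the cache table if it exists, otherwise run the view."""
    cached = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (f"cache_{view}",),
    ).fetchone()
    source = f"cache_{view}" if cached else view
    return sorted(conn.execute(f"SELECT * FROM {source}").fetchall())

live = read_view(conn, "mart_sales_by_product_retailer")    # computed on the fly
refresh_cache(conn, "mart_sales_by_product_retailer")
cached = read_view(conn, "mart_sales_by_product_retailer")  # served from cache
```

Only the views that are queried heavily need a cache table; every other view keeps reading the warehouse directly.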

Virtual Data Marts

[Figure: a virtual data mart, "Sales by Product & Retailer in TTM", aggregating (∑) the retailer,
product, and time dimensions with the sales fact table; sources are the PIM database and the
EDW.]

Virtual Data Marts Example

Virtual Sandboxes & Prototyping

Description:
• A virtual sandbox can be created to allow controlled access to corporate data sources and a
safe environment for data analysts to experiment with and integrate their personal data files
(e.g. log files, Hadoop, Excel, etc.).
• Avoids unfettered access to corporate data sources or copying ‘raw’ data sources into
the corporate data environment.
• Data analysts don’t need to request data extracts (e.g. Excel dumps), so their data stays fresh
and controlled.
• The virtual sandbox can be a separate Data Virtualization instance or a separate Virtual
Database (VDB) within a shared instance.

Virtual Sandboxes & Prototyping


[Figure: an experimentation ‘sandbox’ that combines web log files, text files, and Excel with the
ERP database and the sales EDW, all inside a secure and controlled data environment.]

Big Data Lakes
Logical Data Lake

Description:
• Hadoop is used to store huge amounts of ‘natural’ (raw) data before it is integrated
with ‘curated’ data
• Need to manage volume and complexity of ‘big data’ cost effectively
• Retain raw data for future analysis
• Use existing data analyst skill sets (SQL, Excel) to access and analyse data in Hadoop
• Example: Social media data (e.g. tweets, Facebook ‘likes’, etc.) or web clickstreams are
captured and stored in Hadoop and integrated with customer data stored in DW or CRM

Logical Data Lake

[Figure: a virtual view, "Sales by Customer with Positive Customer Feedback", combining customer
data from the CRM, sales data from the EDW, and data in HDFS files from the Hadoop cluster.]

Logical Data Lake Example

[Figure: layers shown, in the order listed: discovery zone presentation layer, virtual data
views, data lake, systems of record.]

Analytical Data Integration

Description:
• Analytics over the Big Data repository is carried out using a Hadoop SQL engine
(Spark, Cloudera Impala, Apache Drill, IBM Big SQL, etc.).
• Data Virtualization helps to integrate the analytical results with other information
in the enterprise (e.g. customer information from the CRM system, or even results
from another analytical engine).
• This other information gives the analytics results ‘context’, which makes them more
meaningful to the business.
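
As a rough sketch of the idea above, the snippet below joins a small analytical result with CRM context. The two source functions are stubs: in practice the first would be a query against a Hadoop SQL engine and the second a CRM lookup. All names, fields, and values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Insight:
    customer_id: int
    name: str
    segment: str
    positive_mentions: int

def hadoop_engine_results():
    """Stand-in for an aggregate already computed by a Hadoop SQL engine."""
    return [
        {"customer_id": 1, "positive_mentions": 42},
        {"customer_id": 3, "positive_mentions": 7},  # not present in the CRM
    ]

def crm_customers():
    """Stand-in for a CRM lookup keyed by customer id."""
    return {
        1: {"name": "Acme", "segment": "Enterprise"},
        2: {"name": "Globex", "segment": "SMB"},
    }

def integrate(results, customers):
    """Join analytical results with CRM context; skip unknown customers."""
    insights = []
    for row in results:
        context = customers.get(row["customer_id"])
        if context is None:
            continue  # inner-join semantics: no context, no insight
        insights.append(Insight(row["customer_id"], context["name"],
                                context["segment"], row["positive_mentions"]))
    return insights

insights = integrate(hadoop_engine_results(), crm_customers())
print(insights)
```

The heavy aggregation stays in the analytical engine; the virtual layer only joins the small result set with the context data.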

Analytical Data Integration

[Figure: an "Integrated Insights" view built from projections (π) over the PIM database, the
sales EDW, and the Hadoop cluster.]

Analytical Data Integration Example

[Figure: a Tableau dealer/customer dashboard fed by dealer, maintenance, and parts inventory
views over OSI PI and a Hadoop cluster.]

Data Warehouse Offloading

Description:
• Cold (infrequently used) data is offloaded to a Hadoop store, a cheaper storage option.
• Data Virtualization lets reports combine current data from the DW with
cold data from the Hadoop store.
• Example - Sales Data Warehouse: sales data older than 2 years is moved to Hadoop, while more
recent data is kept in the data warehouse.
• For consuming applications the hybrid data warehouse appears as a single data
store.
• The Data Virtualization engine applies query partitioning techniques to prune
unnecessary query branches, which improves performance.
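
The branch pruning mentioned in the last bullet can be illustrated as follows. This is a simplified sketch, not the engine's actual optimizer: each branch of the union declares the date range it can contain, and a query skips any branch whose range cannot overlap its date predicate. The cutoff date, branch names, and rows are hypothetical.

```python
from datetime import date

CUTOFF = date(2023, 1, 1)  # hypothetical boundary between hot and cold data

# Each branch of the union declares the date range it can contain.
BRANCHES = {
    "warehouse": {"min": CUTOFF, "max": date.max,
                  "rows": [(date(2024, 3, 1), 120.0), (date(2023, 6, 15), 80.0)]},
    "hadoop":    {"min": date.min, "max": date(2022, 12, 31),
                  "rows": [(date(2021, 5, 2), 60.0), (date(2020, 1, 9), 30.0)]},
}

def query_sales(start, end):
    """Return (rows, branches_scanned) for sales dated within [start, end]."""
    rows, scanned = [], []
    for name, branch in BRANCHES.items():
        # Prune: skip a branch whose range cannot overlap the predicate.
        if end < branch["min"] or start > branch["max"]:
            continue
        scanned.append(name)
        rows += [r for r in branch["rows"] if start <= r[0] <= end]
    return sorted(rows), scanned

# A report on recent sales touches only the warehouse branch.
recent_rows, recent_scanned = query_sales(date(2023, 1, 1), date(2024, 12, 31))
# A report spanning the cutoff has to read both branches.
all_rows, all_scanned = query_sales(date(2020, 1, 1), date(2024, 12, 31))
```

The consuming application issues the same query in both cases; only the number of sources actually read changes.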

Data Warehouse Offloading

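[Figure: two panels, each with BI / reporting and analytics tools on top. In one, a single data
warehouse holds the sales fact table with dimension 1 (product) and dimension 2 (country). In
the other, the fact table is split: sales less than 2 years old stay in the data warehouse and
sales more than 2 years old sit in Hadoop.]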

Single View Applications
Single View of Customer

Description:
• The Data Virtualization layer offers unified views of business entities.
• Information about critical business entities is usually spread across many
information silos, e.g. “Customer” information held in the CRM system, the billing system,
the provisioning system, etc.
• The sources are a mix of operational systems and informational systems.
• The ‘single view’ is usually built on the fly, although the cache can be enabled to increase
performance.
• Example: a single view of the customer for call center agents, so they can better handle
customer enquiries.
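
A minimal sketch of a single view built on the fly, with an optional cache, might look like this. The three dictionaries stand in for the CRM, billing, and support systems; their contents and the function names are hypothetical.

```python
from functools import lru_cache

# Hypothetical silos, each keyed by customer id.
CRM = {7: {"name": "Jane Doe", "address": "1 Main St"}}
BILLING = {7: [{"invoice": "INV-1", "amount": 30.0}]}
SUPPORT = {7: [{"ticket": "T-9", "status": "open"}]}

calls = {"count": 0}  # counts how often the silos are actually read

def build_single_view(customer_id):
    """Assemble the customer view on the fly from every silo."""
    calls["count"] += 1
    details = CRM.get(customer_id)
    if details is None:
        raise KeyError(f"unknown customer {customer_id}")
    return {
        "customer_id": customer_id,
        **details,
        "invoices": BILLING.get(customer_id, []),
        "tickets": SUPPORT.get(customer_id, []),
    }

# Optional cache in front of the same function, trading freshness for speed.
cached_single_view = lru_cache(maxsize=1024)(build_single_view)

view = build_single_view(7)  # reads the silos
cached_single_view(7)        # reads the silos once more, then caches
cached_single_view(7)        # served from the cache, no silo access
print(view["name"], calls["count"])
```

Note that `lru_cache` hands back the same dictionary object on every hit, so callers should treat the cached view as read-only.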

Single View of Customer
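[Figure: enterprise apps consume a single customer view that joins (∞) customer details, address,
products, billing, logs, and incidents. The underlying data sets (customer, products, billing,
network logs, trouble tickets) live in the CRM, network, finance, and support systems.]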

Single View of Customer Example

Enterprise Data Layer/Data Services
Enterprise Data Layer

Description:
• Provide a data access layer so that users across the enterprise can access
data in a secure and managed fashion and share a common data ‘model’
• Provide secure and managed access to data across the enterprise
• Provide consistency of data
• Hide complexity, format, and location of actual data sources
• Support many consumption protocols and patterns
• Example: Single data access layer for all development teams to avoid ‘hunting down
and interpreting data differently by project’

Enterprise Data Layer

[Figure: enterprise apps reach the data layer through SQL (JDBC/ODBC), RESTful Web Services,
SOAP, JMS, etc. The layer joins (∞) and unions (U) data from operational systems, analytical
systems, Big Data systems, and external/SaaS sources.]

Enterprise Data Layer Example

Enterprise Data Marketplace

Description:
• A version of an Enterprise Data Access Layer, but users can ‘subscribe’ to data via
the marketplace
• Provide secure and managed access to data across the enterprise
• Hide complexity, format, and location of actual data sources
• Provide a ‘search and subscribe’ mechanism that lets users find and access data
services
• Example: An organization provisions access to data services via a managed and
secure marketplace. The services are provided by a Data Virtualization layer.
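
The ‘search and subscribe’ mechanism can be sketched as a small in-memory catalog. This is an illustration only, not a description of any product feature: the class `DataMarketplace`, its methods, and the sample services are all hypothetical.

```python
class DataMarketplace:
    """Toy catalog: publish data services, search them, subscribe to them."""

    def __init__(self):
        self._services = {}       # service name -> description
        self._subscriptions = {}  # user -> set of subscribed service names

    def publish(self, name, description):
        self._services[name] = description

    def search(self, keyword):
        """Return names of services whose name or description matches."""
        needle = keyword.lower()
        return sorted(
            name for name, description in self._services.items()
            if needle in name.lower() or needle in description.lower()
        )

    def subscribe(self, user, name):
        if name not in self._services:
            raise KeyError(f"unknown data service: {name}")
        self._subscriptions.setdefault(user, set()).add(name)

    def can_access(self, user, name):
        """Access is managed: only subscribers may use a service."""
        return name in self._subscriptions.get(user, set())

market = DataMarketplace()
market.publish("customer_360", "Single view of customer across CRM and billing")
market.publish("sales_history", "Sales facts from the data warehouse and Hadoop")

found = market.search("customer")
market.subscribe("analyst_a", "customer_360")
print(found, market.can_access("analyst_a", "customer_360"))
```

In a real deployment the services behind each catalog entry would be views published by the Data Virtualization layer, and the subscription would drive access permissions.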

Thanks!

www.denodo.com [email protected]
© Copyright Denodo Technologies. All rights reserved
Unless otherwise specified, no part of this PDF file may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and microfilm,
without the prior written authorization from Denodo Technologies.
