How To Define Build and Operationalize A Data Fabric
Essentials: How to Define, Build and Operationalize a Data Fabric
Ehtisham Zaidi
© 2023 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. or its affiliates. This presentation, including all supporting materials,
is proprietary to Gartner, Inc. and/or its affiliates and is for the sole internal use of the intended recipients. Because this presentation may contain information that is confidential,
proprietary or otherwise legally protected, it may not be further copied, distributed or publicly displayed without the express written permission of Gartner, Inc. or its affiliates.
Key Issues
1. The What and the Why — Defining a data fabric design in a way that
is understood by business teams
2. The How — 10 steps to stitch together your data fabric design to
automate your data management infrastructure
• Three paths to operationalizing your data fabric
3. The Where — Navigating the complex vendor landscape to select
mature technology components
The What and the Why: Defining the Data Fabric Design
Data Fabric Delivers Integrated Data to All Data Consumers
[Figure labels: Compounds; Customers; Products; Claims; Data Fabric; Flat Files; Third Party; Legacy Data Warehouses/Marts; Cloud Data Warehouses, Cloud Data Lakes; Hadoop, File Stores; XML, JSON, AVRO, PDF, DOC, WEB]
The Evolution of Data Architecture: So, How Did We Get Here?
[Figure: three eras of data architecture]
• 2000s, Post-EDW Era: Fragmented Analysis
• 2010s, LDW Era: Unified Analysis
• 2020s, Active Metadata Era: Augmented Analysis
[Other figure labels: Apps; Success of EDW; Data, AI Orchestration; Adaptive Practices; Common Semantic Layer; Data Warehouses; Data Marts; Data Lakes; Custom Sandboxes; Operational Data Stores; Operational Databases; Metadata Analysis]
Source: Data Fabric or Data Mesh: How to Decide Your Future Data Management Architecture (G00770696)
Note: EDW = Enterprise Data Warehouse
LDW = Logical Data Warehouse
The What? Let’s Define Data Fabric
• for attaining …
• that utilizes …
• in support of … faster and, in some cases, automated data access and sharing
• regardless of ...
Data Fabric = Metadata Analysis + Recommendations
It acts as an intelligent orchestration engine!
Data Management Overpowers What Should Be a CDO Focus on Business Goals
[Figure labels: Systems and Processes; Business Goals; Data; Technical Control]
Offloading Data Management Permits a Focus on Business Goals in the Digital Business
[Figure labels: Data; Technical Control]
• Become the “dynamic designer”
• “Orchestration engine” for data management
Data Fabric Listens, Learns and Acts on Metadata
[Figure labels: Participating Systems; Data Consumers; Other Systems]
Data fabric applies continuous analytics over existing, discoverable and inferred metadata
assets. By enriching the semantics of the underlying data, it generates alerts and
recommendations that people and systems can act on.
The Why: A. There Is Something for Everyone
[Figure labels: Business Users; The Enterprise]
The Why: B. Automation Is Inevitable!
[Two illustrative charts. Titles as extracted: "Expected Reduction in Human Effort" and "Improve Data Utilization by 2025", with the fragment "Automation Will". Vertical axes: Relative Effort and Relative Benefit, each from 0% to 200%. Horizontal categories: Design, Deployment, Support, Quality/Mastering, Utilization.]
The Why: C. Traditional Integration Approaches: Challenges Galore!
Integration Teams Have a Tendency to Use Too Much Tech
Prevalence of use, from most used to least used:
• Bulk/Batch Data Movement
• Data Replication/Data Synchronization
• Message-Oriented Movement of Data
• Data Virtualization
• Stream Data Integration
Point-to-Point Integrations Scale Poorly
[Figure: applications connected directly to one another]
Point-to-Point Integrations Create (N*M) Connections
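The (N*M) count for point-to-point integration can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the comparison with a shared intermediary layer (N+M connections) is an assumption added for contrast, not something stated on the slide.

```python
def point_to_point_connections(n_sources: int, m_targets: int) -> int:
    """Every source is wired directly to every target: N*M links."""
    return n_sources * m_targets


def shared_layer_connections(n_sources: int, m_targets: int) -> int:
    """Assumed alternative: each system connects once to a shared layer: N+M links."""
    return n_sources + m_targets


if __name__ == "__main__":
    for n, m in [(3, 3), (10, 10), (50, 20)]:
        print(
            f"{n} sources x {m} targets: "
            f"point-to-point={point_to_point_connections(n, m)}, "
            f"shared layer={shared_layer_connections(n, m)}"
        )
```

The gap widens quickly: the point-to-point count grows with the product of the two sides, while the assumed shared-layer count grows only with their sum.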
The Why: D. Traditional Modeling Approaches: Don’t Scale!
The Why: E. Data Fabric Covers Both Transactional and Analytical Systems
[Figure labels: Transactional; Analytical; Reporting; ETL (three times)]
The How: 10 Steps to Stitch Together Your Data Fabric Design
Data Fabrics Need to Be Designed, With Composable Technology Parts
[Figure: component layers between Data Consumers and Data Sources. Labels: Recommendation Engines; Data Preparation and Data Integration; Data Delivery; Metadata Activation; Knowledge Graph]
Go Beyond Technical Metadata Collection
[Four columns of examples, left to right]
• Schemas
• Data types
• Data models

• ETL or actions on data
• Lineage metadata
• Performance metadata

• Ontology: classify and tag data
• Metadata mapped to business relationships

• Business user knowledge
• Developer feedback
Remember: Passive Metadata = Design Metadata + Runtime Metadata (Fabric Needs Both)
[Figure: items grouped under Design and Runtime. Labels: Processing Code; Transaction Logs; Report; Query Logs; Labels/Tags; Runtime Statistics; Resource Allocation; Org Chart; Schema; Design Specs; Integrity]
When you compare design metadata to runtime metadata, the difference is rich with signals.
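One way to picture the design-versus-runtime comparison is a set difference between what a schema declares and what query logs show being used. The sketch below is a minimal illustration under assumptions: the table, the columns, the log format and the function name `compare_design_to_runtime` are all hypothetical, not taken from the presentation or from any product.

```python
# Design metadata (hypothetical): columns declared for each table.
design_schema = {
    "orders": {"order_id", "customer_id", "amount", "legacy_flag"},
}

# Runtime metadata (hypothetical): (table, column) pairs seen in query logs.
runtime_query_log = [
    ("orders", "order_id"),
    ("orders", "amount"),
    ("orders", "amount"),
    ("orders", "discount_code"),
]


def compare_design_to_runtime(design, log):
    """Return (table, column, signal) tuples where design and runtime disagree."""
    used = {}
    for table, column in log:
        used.setdefault(table, set()).add(column)

    signals = []
    for table in sorted(set(design) | set(used)):
        declared = design.get(table, set())
        observed = used.get(table, set())
        for column in sorted(declared - observed):
            signals.append((table, column, "declared but never queried"))
        for column in sorted(observed - declared):
            signals.append((table, column, "queried but not in design"))
    return signals


if __name__ == "__main__":
    for signal in compare_design_to_runtime(design_schema, runtime_query_log):
        print(signal)
```

Each returned tuple is one candidate signal: an unused column may point to a model that has drifted from the business, and an undeclared column to a change that design metadata has not caught up with.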
Step 2: Create Knowledge Graphs
Source: Complexity Kills – How European Banking Models Have to Change in a Complex World
How to Develop Knowledge Graphs That Support the Data Fabric Design
Populating a Knowledge Graph From Source Data
[Figure: flow from source data to an application. Labels: Database; Documents/Images; Mapping/Virtualization/API; Instance Population; Entity Extraction; Entities; Ontology; Ontology Mapping; Knowledge Graph; Query; Application (Minimum Viable Product)]
• An ontology is a formal naming and definition of categories, properties and relations between
concepts.
• A knowledge graph is a data structure that stores entity definitions (like people, products and digital
assets) as a graph – a network of nodes and edges. Information/knowledge is located via an index
within the graph or synthesized as a data source on demand.
Source: Gartner
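The two definitions above can be made concrete with a very small in-memory structure. This is a sketch under assumptions, not a recommended implementation: the class `KnowledgeGraph`, the `ONTOLOGY` triples and the entity names are hypothetical, and a real deployment would typically use a graph database rather than Python dictionaries.

```python
from collections import defaultdict

# Hypothetical ontology: which relation may link which categories of entity.
ONTOLOGY = {
    ("Supplier", "supplies", "Part"),
    ("Part", "used_in", "Product"),
}


class KnowledgeGraph:
    """Nodes are entities with a category; edges are ontology-checked relations."""

    def __init__(self, ontology):
        self.ontology = ontology
        self.category = {}             # entity name -> ontology category
        self.edges = defaultdict(set)  # (subject, relation) -> set of objects

    def add_entity(self, name, category):
        self.category[name] = category

    def add_relation(self, subject, relation, obj):
        triple = (self.category[subject], relation, self.category[obj])
        if triple not in self.ontology:
            raise ValueError(f"ontology does not allow {triple}")
        self.edges[(subject, relation)].add(obj)

    def objects(self, subject, relation):
        """Follow an edge forward, e.g. which parts a supplier supplies."""
        return sorted(self.edges.get((subject, relation), set()))

    def subjects(self, relation, obj):
        """Follow an edge backward, e.g. which suppliers supply a part."""
        return sorted(
            s for (s, r), objs in self.edges.items() if r == relation and obj in objs
        )


kg = KnowledgeGraph(ONTOLOGY)
kg.add_entity("SupplierA", "Supplier")
kg.add_entity("SupplierB", "Supplier")
kg.add_entity("Part1", "Part")
kg.add_entity("ProductX", "Product")
kg.add_relation("SupplierA", "supplies", "Part1")
kg.add_relation("SupplierB", "supplies", "Part1")
kg.add_relation("Part1", "used_in", "ProductX")
```

The ontology check in `add_relation` is the point of the sketch: the graph only accepts relations that the ontology names, which is how data objects stay tied to business meaning.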
Knowledge Graphs Map Data Objects to Business Contexts (through Ontologies)
• Semantics: the ability to capture meaning and relationships in data
• ETL-based data models had semantics assigned by IT developers
• In fact, semantics were captured in IT-aligned data dictionaries
Step 3: When Exploring Data Fabric, Many Clients Ask: “What Is Active Metadata?”
• Passive metadata approaches are like a thermometer that “monitors” the data management and current utilization.
• Active metadata approaches are like a thermostat that “regulates” data management and utilization.
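The thermometer and thermostat analogy can be sketched in a few lines. Everything here is a hypothetical illustration: `PipelineStats`, `monitor`, `regulate` and the halving rule are assumptions chosen to show the contrast, not behavior described in the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineStats:
    name: str
    minutes_since_refresh: int
    refresh_interval_minutes: int


def monitor(stats: PipelineStats) -> str:
    """Passive (thermometer): report the reading and change nothing."""
    return f"{stats.name}: last refreshed {stats.minutes_since_refresh} min ago"


def regulate(stats: PipelineStats, max_staleness_minutes: int) -> PipelineStats:
    """Active (thermostat): adjust the schedule when the reading is out of range."""
    if stats.minutes_since_refresh > max_staleness_minutes:
        # Assumed corrective action: refresh twice as often, never below 1 minute.
        new_interval = max(1, stats.refresh_interval_minutes // 2)
        return PipelineStats(stats.name, stats.minutes_since_refresh, new_interval)
    return stats


if __name__ == "__main__":
    sales = PipelineStats("sales", minutes_since_refresh=90, refresh_interval_minutes=60)
    print(monitor(sales))
    print(regulate(sales, max_staleness_minutes=60))
```

Both functions read the same metadata; only `regulate` feeds a decision back into how the pipeline is run.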
Start Activating Metadata Through Accumulation of Trust
Start With Some Metadata: You Don’t Need It All
[Figure labels: Alignment; Alerts; Orchestration; Recommendations; Preparation; Exception]
1. First, collect design time metadata
2. Then do runtime log
3. Compare design information with runtime
4. Send alerts when things change
When data starts to have design or quality issues, it is a signal that the business is
changing and implies changes to the infrastructure might be needed.
Step 4: Use Active Metadata-Based Insights for …
Insight
• Anomaly and outlier detection
• Highlighting of sensitive attributes for GDPR, etc.
• Insights through observability
Engagement
• Allow less-skilled integrators and SMEs to find data of interest through semantics and search
Automation
• Automated correction of schema drifts
• Autointegrate “next best” transforms
• Recommend optimal infrastructure/execution engine choice
• Promote self-service integration flows
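As one example of the insight use cases, highlighting sensitive attributes can be approximated by profiling sample values. The sketch below is an assumption-laden illustration: the email pattern is deliberately simple, the 0.8 threshold is arbitrary, and `flag_sensitive_columns` is a hypothetical helper rather than a feature of any named tool.

```python
import re

# Deliberately simple email pattern; real classifiers cover many more data types.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def flag_sensitive_columns(samples, threshold=0.8):
    """Return columns where at least `threshold` of non-empty samples look like emails."""
    flagged = []
    for column, values in samples.items():
        non_empty = [v for v in values if v]
        if not non_empty:
            continue
        share = sum(bool(EMAIL.match(v)) for v in non_empty) / len(non_empty)
        if share >= threshold:
            flagged.append(column)
    return sorted(flagged)


if __name__ == "__main__":
    column_samples = {
        "contact": ["ana@example.com", "li@example.org", ""],
        "city": ["Oslo", "Rome", "Lima"],
    }
    print(flag_sensitive_columns(column_samples))
```

The output of such a check is itself metadata (a sensitivity tag on a column) that downstream policies or recommendations could then use.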
Step 5: A Strong Data Integration and Data
Preparation Backbone Is a Must
Step 6: Deliver Integrated Data for Self-Service – But
Monitor for Operationalization & Governance
[Figure labels: Mode 2 Pipeline; Raw Data]
A data fabric does not require data management optimization, but it does enable it — automation not required.
Step 7: Utilize DataOps to Streamline Data Integration
Delivery
Data observability to track lineage, accuracy, freshness & pipeline breakages
*A Gartner client’s scenario
[Figure: Source 1, Source 2 and Source 3 feed a tool chain. Tool labels: Python; Qlik; Snowflake; dbt Labs; Talend; AWS Glue; Tableau; Informatica; AWS S3; Collibra]
DataOps = Data Orchestration + Data Observability + Scheduling/Task Flow + CI/CD + Version Control (Git/Jenkins)
By 2025, a data engineering team guided by DataOps practices and tools will be 10x more productive than teams that are not.
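Two of the observability signals named for this step, freshness and lineage, can be sketched with the standard library alone. The dataset names, the `LINEAGE` mapping and both helper functions are hypothetical; the tools listed in the client scenario expose their own interfaces, which are not shown here.

```python
from datetime import datetime, timedelta

# Hypothetical lineage: dataset -> list of datasets it is built from.
LINEAGE = {
    "raw_orders": [],
    "clean_orders": ["raw_orders"],
    "sales_dashboard": ["clean_orders"],
}


def stale_datasets(last_loaded, now, max_age):
    """Freshness check: datasets whose last load is older than max_age."""
    return sorted(name for name, ts in last_loaded.items() if now - ts > max_age)


def downstream_of(dataset, lineage):
    """Lineage check: every dataset that directly or indirectly depends on `dataset`."""
    impacted = set()
    frontier = [dataset]
    while frontier:
        current = frontier.pop()
        for node, upstreams in lineage.items():
            if current in upstreams and node not in impacted:
                impacted.add(node)
                frontier.append(node)
    return sorted(impacted)


if __name__ == "__main__":
    now = datetime(2023, 1, 1, 12, 0)
    last_loaded = {
        "raw_orders": now - timedelta(hours=30),
        "clean_orders": now - timedelta(hours=1),
        "sales_dashboard": now - timedelta(hours=1),
    }
    for name in stale_datasets(last_loaded, now, timedelta(hours=24)):
        print(name, "is stale; impacted:", downstream_of(name, LINEAGE))
```

Combining the two checks turns a single stale source into a list of affected downstream assets, which is the kind of pipeline-breakage signal a DataOps practice would route to owners.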
Step 8: Deliver Integrated Data as a Data Product (for
Mesh-Style Delivery) — When Ready!
[Figure labels: Data Consumers; Domain Team A; Data Sources]
Step 9: Focus on the Right Teams and Skills
[Figure label, repeated: Business Stakeholders]
Step 10: Provision Integrated Data As ‘Data Products’
[Figure labels: Emerging Trend; Templatize; Provision (DataOps capabilities)]
• Data products are modular, loosely coupled and self-contained data apps that enable independent build/deployment cycles.
Usually controlled by SMEs at domain level.
• Drive targeted consumer experiences by templatizing D&A products and provisioning D&A product instances with agility and at scale.
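The templatize-then-provision idea can be illustrated as a template object from which per-domain instances are created. This is a sketch under assumptions: `DataProductTemplate`, `DataProductInstance`, `provision` and all field names are hypothetical, and the presentation does not prescribe any particular structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProductTemplate:
    """Reusable definition: what every instance of this product looks like."""
    name: str
    columns: tuple
    refresh: str


@dataclass(frozen=True)
class DataProductInstance:
    """One provisioned, self-contained product owned at domain level."""
    template: DataProductTemplate
    domain: str
    owner: str

    @property
    def identifier(self) -> str:
        return f"{self.domain}.{self.template.name}"


def provision(template: DataProductTemplate, domain: str, owner: str) -> DataProductInstance:
    """Create an instance from a template; an accountable owner is required."""
    if not owner:
        raise ValueError("a data product needs an accountable owner")
    return DataProductInstance(template, domain, owner)


template = DataProductTemplate("customer_360", ("customer_id", "segment"), "daily")
instances = [
    provision(template, domain, owner)
    for domain, owner in [("sales", "sme_sales"), ("claims", "sme_claims")]
]
```

Because each instance carries its own domain and owner but shares the template, instances can be built and deployed independently while staying structurally consistent.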
Bringing the Fabric to Life: 3 Paths to Operationalizing Your Data Fabric
Quadrants of Awareness and Understanding
[Figure: grid with an Awareness axis split into Knowns and Unknowns. Quadrant text: "Things that we are aware of and understand"; "Things that we understand but are unaware of". Examples shown: Log analysis; Observability; Monitoring; Predictions]
• The data fabric needs to support known knowns
• Finally, it needs to start automating to improve the overall system performance
Path 1: The Foundational Path
[Figure: component stack from Data Consumers to Data Sources. Labels: DataOps (6); Recommendation Engine (4); Data Integration and Data Preparation (5); Active Metadata]
Path 2: The Advanced Path
[Figure: component stack from Data Consumers to Data Sources. Labels: DataOps (6); Recommendation Engine (4); Data Integration and Data Preparation (5); Active Metadata]
Path 3: The Automation Path
[Figure: component stack from Data Consumers to Data Sources. Labels: DataOps (6); Recommendation Engine (4); Data Integration and Data Preparation (5); Active Metadata; Metadata Activation (3); Knowledge Graph, Enriched With Semantics (2); Data and Metadata]
The Automation Path: Need for Automation?
Components Used: 3 and 4, along with other components
Key Issue Take-Away:
• Data fabric is a design concept.
• You cannot buy it!
• It needs to be designed with modular
technology components.
The Where: Navigating the Complex Vendor Landscape
Hype Cycle for Data Management 2022
Source: Hype Cycle for Data Management, 2022, 30 June 2022 (G00770739)
Adoption Trends for Data Fabric
[Chart: Data Fabric adoption, n = 461]
• Exploring: 28%
• Likely adoption in the next 2 years: 38%
• Not actively considering: 34%
Currently, about 1 in 4 organizations are pursuing (i.e., enterprise “intents” rather than actual deployments) a fabric. Among these respondents, 3 in 5 are both exploring and adopting a fabric in their D&A practices.
Source: Gartner
Best-of-Breed vs. Best-Fit Engineering Tool Options
[Figure: provider options: Incumbent, or Independent Vendors, or Service Providers That Use Open Source + Consulting. Tool categories: DataOps-Enabling Technology; Data Preparation Tools; Knowledge Graphs + Semantic Enrichment; Metadata Management Tools/Data Catalogs]
Representative Vendors With Technology and Capabilities That Support the Data Fabric Design (1)
Vendors: Alex Solutions; Alteryx; Apache Software Foundation; Ataccama; Atlan; Cambridge Semantics; Cinchy; CluedIn; data.world; Denodo; IBM; Informatica; Irion; Neo4j; Nexla; Oracle; Palantir; SAP; Semantic Web Company; Stardog; Stratio; TADA; Talend; TIBCO
[Figure labels: Augmented Data Catalogs; Knowledge Graph; Data Orchestration and DataOps; Capabilities Supporting Data Integration; Stand-Alone Data Fabrics?; Data Integration/Preparation/Virtualization/Delivery; Semantics/Data Modeling Tools; AI/ML Toolkits for Automation]
(1) Representative list of vendors
Understand That Data Fabrics Will Take Time to
Mature — But You Need to Start NOW!
• Assembly Required: You cannot buy data fabric! Develop it based on use cases, design and various tool(s).
• Comprehensiveness of Input: This needs a central metadata practice. Orgs must incentivize teams to continually practice metadata collection, enrichment and sharing.
• Lack of Talent: Knowledge graphs, nonrelational data stores, graph modeling (RDFs, etc.), query languages (GraphQL). Do you have the talent?
• Culture: Existing culture is to centralize data and to throw traditional technology at problems.
Apply the benefits of data fabrics by experimenting, reimagining and discovering, as the innovation is at an early stage.
Key Issue Take-Away:
Don’t look for data fabric “platforms” —
Instead look for composable, tightly
integrated, yet loosely coupled services
that share metadata.
What’s Next for Data Fabric:
2023 Strategic Planning Assumptions
Appendix:
Case Studies
Case Study 1: Jaguar Land Rover
Jaguar Land Rover saw that a connected view of its supply and demand data
enabled it to become efficient in answering critical business questions.
Examples of the two-way line of sight that enables exploration, discovery and
inference:
• Only one supplier provides this part
• Potential alternative suppliers
Access to Gartner research is subject to entitlement. For information, please contact your Gartner representative.