Data Ops
© 2022 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates. This publication may not be reproduced or distributed in any form
without Gartner's prior written permission. It consists of the opinions of Gartner's research organization, which should not be construed as statements of fact. While the information contained in this
publication has been obtained from sources believed to be reliable, Gartner disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner research
may address legal and financial issues, Gartner does not provide legal or investment advice and its research should not be construed or used as such. Your access and use of this publication are
governed by Gartner’s Usage Policy. Gartner prides itself on its reputation for independence and objectivity. Its research is produced independently by its research organization without input or
influence from any third party. For further information, see "Guiding Principles on Independence and Objectivity."
Current State of Data and Analytics Delivery:
Agile Development + Fragile Operations
Figure labels: Inception, Product Increments, Customer Value
Key Issues
Data engineering is the discipline
of translating data into “usable forms”
Data Engineering Operational Complexity
Issue types: Design issues, Post-deployment issues
What happened?
1. Doesn’t run
2. Doesn’t run “good enough”
3. Doesn’t run “fast enough”
What’s the root cause?
• Wrong code/wrong data/wrong schema
• Wrong environment
• Config drift
• Data drift
• Data pipeline issue
• Need more compute?
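One of the root causes listed above, a wrong schema, can often be caught before a pipeline runs. The following is a minimal sketch, assuming records arrive as plain dictionaries and the expected schema is a mapping from column name to Python type. The function name find_schema_issues and the sample columns are hypothetical and not part of the original material.

```python
def find_schema_issues(expected, record):
    """Compare one record against an expected schema.

    `expected` maps column name -> Python type (an assumed, simplified
    schema format). Returns a list of (column, problem) tuples.
    """
    issues = []
    # Columns the schema requires: report missing or wrongly typed values.
    for column, expected_type in expected.items():
        if column not in record:
            issues.append((column, "missing"))
        elif not isinstance(record[column], expected_type):
            got = type(record[column]).__name__
            issues.append((column, f"expected {expected_type.__name__}, got {got}"))
    # Columns the schema does not know about.
    for column in record:
        if column not in expected:
            issues.append((column, "unexpected"))
    return issues


expected_schema = {"order_id": int, "amount": float}
incoming_record = {"order_id": "17", "currency": "EUR"}
issues = find_schema_issues(expected_schema, incoming_record)
```

A check like this only addresses the "doesn’t run" class of problems. Config drift, data drift and capacity issues need different signals.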
DataOps Adds “Software Mindset” to
Data Management
DataOps is an agile and collaborative data management practice focused on
improving the communication, integration, automation, observability and operations
of data flows between data managers and data consumers.
DataOps Helps You Achieve “Good”
Data Management Metrics
Metric areas shown: Business alignment, Process management
Metrics shown:
• Code quality
• Service tickets
• Productivity
• Data value gap
• Time to use
• Data-as-a-product
• Self-service
• Release velocity
• Collaboration
• Reuse
Data Engineering Is a Critical Skill
With High Demand
Skills shown: SQL, Java, Python, DevOps, Automation, CI/CD, Kubernetes, Cloud Environments
Categories shown: Core Data Engineering, Critical Data Engineering
Data Engineer’s Key Activities
• Study usage patterns
• Manage metadata
• Build data pipelines (two-thirds of the time)
• Support data science
• Collaborate across business and IT
• Drive automation
Automate! From provisioning servers, automated data ingestion, storage and scheduling of pipelines, to self-healing pipelines and dynamic workload management …
All recurring patterns can be templatized, but not all of them have value. Metadata analysis can guide you in picking the right use cases for automation.
Legend: Emerging trend, Critical activity
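The point about using metadata analysis to pick automation use cases can be illustrated with a small sketch. It assumes that run metadata is available as records carrying a pattern label and the hands-on time spent. The PipelineRun schema, the min_occurrences threshold and the sample values are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineRun:
    """One metadata record for a pipeline execution (hypothetical schema)."""
    pattern: str         # e.g. "csv_to_warehouse"
    manual_minutes: int  # hands-on engineering time spent on this run


def rank_automation_candidates(runs, min_occurrences=3):
    """Return (pattern, occurrences, total_manual_minutes), highest effort first.

    Patterns seen fewer than `min_occurrences` times are dropped, since they
    recur too rarely for a template to pay off.
    """
    counts = Counter(run.pattern for run in runs)
    effort = Counter()
    for run in runs:
        effort[run.pattern] += run.manual_minutes
    candidates = [
        (pattern, counts[pattern], effort[pattern])
        for pattern in counts
        if counts[pattern] >= min_occurrences
    ]
    return sorted(candidates, key=lambda c: c[2], reverse=True)


runs = [
    PipelineRun("csv_to_warehouse", 30),
    PipelineRun("csv_to_warehouse", 25),
    PipelineRun("csv_to_warehouse", 35),
    PipelineRun("api_to_lake", 10),
    PipelineRun("api_to_lake", 10),
    PipelineRun("api_to_lake", 10),
    PipelineRun("one_off_export", 120),
]
ranked = rank_automation_candidates(runs)
```

In this sample the one-off export costs the most time in a single run, yet it is excluded because it does not recur. Manual effort is only one possible proxy for value; a real ranking would likely weigh business impact as well.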
Data Engineer’s Top 3 Challenges
A Representative DataOps Team
Key Issues
No. 1: Product Delivery Mindset: Replace
Monolithic Practices With Modularity
Catalog: D&A Products 1, 2, …, n
Templatize:
• Data store
• Data pipelines
• Analytic model
• User interface
Provision (DevOps capabilities):
• Infrastructure as code
• Access control
• Version control
• Continuous integration/deployment
• Regression test packs
Foundation: Data and Analytics Platforms
Legend: Emerging trend
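The idea of templatizing product components and provisioning shared DevOps capabilities can be sketched as follows. This is a minimal illustration, assuming a product is described by four components plus a fixed set of capabilities. The ProductTemplate class, the provision function and all sample values are hypothetical names, not an existing tool.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class ProductTemplate:
    """Reusable blueprint for a D&A product (field names are illustrative)."""
    data_store: str
    data_pipelines: tuple
    analytic_model: str
    user_interface: str
    # DevOps capabilities provisioned with every product built from the template.
    devops_capabilities: tuple = (
        "infrastructure_as_code",
        "access_control",
        "version_control",
        "continuous_integration_deployment",
        "regression_test_pack",
    )


def provision(template, name, **overrides):
    """Create one catalog entry from the template, changing only what differs."""
    spec = replace(template, **overrides)
    return {"name": name, **asdict(spec)}


base = ProductTemplate(
    data_store="warehouse_schema",
    data_pipelines=("ingest", "curate"),
    analytic_model="none",
    user_interface="dashboard",
)

catalog = [
    provision(base, "sales_overview"),
    provision(base, "churn_scoring", analytic_model="churn_classifier"),
]
```

Each product overrides only the module that differs, while the DevOps capabilities stay identical across the catalog. That is the modularity this slide contrasts with monolithic practices.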
No. 2: Automation Mindset: Pick the Right Use
Cases for Automation by Metadata Analysis
Pipeline stages: Raw data (multiple sources) → Ingest → Explore → Model → Curate → Catalog → Optimized data
Stage labels: Iterative, Resource-intensive
Data states: Data at Rest (Encrypted) → Data in Motion (Detokenized/Deidentified appropriately) → Data at Rest/Use (Encrypted)
Metadata drives pipeline patterns. This is a current trend, e.g., data warehouse automation tools.
Contextualize data better by studying consumption patterns by users and systems.
Data observability is an emerging trend in this context. Active metadata forms the basis
for the data fabric design.
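The statement that metadata drives pipeline patterns can be made concrete with a small sketch. It assumes a dataset's metadata names the steps to apply and the columns they act on. The step names, the metadata keys and the sample row are hypothetical. Commercial data warehouse automation tools work at a much larger scale and are not modeled here.

```python
def deidentify(rows, meta):
    """Mask every column that the metadata tags as sensitive."""
    sensitive = set(meta.get("sensitive_columns", []))
    return [
        {col: ("***" if col in sensitive else val) for col, val in row.items()}
        for row in rows
    ]


def curate(rows, meta):
    """Keep only the columns that the metadata lists for consumers."""
    keep = meta["output_columns"]
    return [{col: row[col] for col in keep} for row in rows]


# Registry of reusable steps; the metadata refers to them by name.
STEP_REGISTRY = {"deidentify": deidentify, "curate": curate}


def build_pipeline(meta):
    """Assemble a pipeline function from the steps named in the metadata."""
    steps = [STEP_REGISTRY[name] for name in meta["steps"]]

    def run(rows):
        for step in steps:
            rows = step(rows, meta)
        return rows

    return run


customer_meta = {
    "steps": ["deidentify", "curate"],
    "sensitive_columns": ["email"],
    "output_columns": ["customer_id", "email", "country"],
}
raw_rows = [
    {"customer_id": 1, "email": "a@example.com", "country": "DE", "debug": "x"},
]
result = build_pipeline(customer_meta)(raw_rows)
```

Changing the metadata record, not the code, changes what the pipeline does. That is why the quality of the metadata matters so much for this pattern.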
Automate Testing and Release Processes:
Continuous Integration Pipelines
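Developer commits trigger the CI pipeline:
1. Run build scripts* (on failure: build failed)
2. Run unit tests (on failure: unit test failed)
3. Deploy to test env. (on failure: deployment failed)
4. Run automated tests (on failure: regression test failed)
After the CI pipeline: advanced tests, release process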
CI = Continuous Integration
Scripts* = DDL/DML, pipeline, metadata, operations config, etc.
Source: Data and Analytics Essentials: DataOps (G00767464)
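The stage-by-stage flow of a CI pipeline, where a failure at any stage stops the run, can be sketched in a few lines. This is a simplified illustration, assuming each stage is a callable that returns True on success. The run_ci function and the stage names are hypothetical. A real setup would delegate this to a CI server.

```python
def run_ci(stages):
    """Run stages in order and stop at the first one that fails.

    `stages` is a list of (name, action) pairs, where `action` is a
    zero-argument callable returning True on success. An exception raised
    by a stage is treated as a failure of that stage.
    """
    completed = []
    for name, action in stages:
        try:
            ok = bool(action())
        except Exception:
            ok = False
        if not ok:
            return {"status": "failed", "failed_stage": name, "completed": completed}
        completed.append(name)
    return {"status": "ready_for_release", "failed_stage": None, "completed": completed}


# Stand-in stage actions; here the unit tests fail on purpose.
stages = [
    ("run_build_scripts", lambda: True),
    ("run_unit_tests", lambda: False),
    ("deploy_to_test_env", lambda: True),
    ("run_automated_tests", lambda: True),
]
outcome = run_ci(stages)
```

Because the run stops at the unit tests, nothing is deployed to the test environment. The commit only reaches the advanced tests and release process once every stage passes.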
No. 3: Ops Enablement Mindset: Agile Practices
When Empowering Citizen Roles
Figure labels: Start here; Ops overhead
Test Upfront for Feasibility and Business Value
Source: 3 Case Studies of Data and Analytics Driven Business Innovation (G00751851)
* Pseudonym
Key Issues
Data Engineering Stretches Beyond the Core
Data Management Practices
Source: How to Build a Data Engineering Practice That Delivers Great Consumer Experiences (G00741778)
NIKE Acquires Datalogue to Add
“Software Mindset” to Data Management
News: NIKE acquired a startup based in New York to enable its digital transformation.
Datalogue had a proprietary machine learning technology that automates data
preparation and integration.
— February 2021
Hub-Spoke Operating Model: Formalize and Scale
Hub: Central D&A Team
Spokes: Marketing, Finance, Supply Chain, Data Science Lab
The central team establishes your franchise (processes, capabilities, best practices) so
that your brand remains consistent. The satellite teams within the departments adapt to
local environments (data, people).
Data Engineering Activities Vary Across
Central and Departmental Teams
Figure: percentage breakdown of data engineering activities (including data management) for the Central D&A Team, Supply Chain and Data Science Lab
Software engineering and I&O tasks will be higher in the central team, while the core data
tasks will be higher in the departmental teams, with minimal platform operations.
Draft and Improvise Responsibilities Distribution
Recommendations
Recommended Gartner Research