DataOps is NOT Just DevOps for Data

Figure 1: DevOps is often depicted as an infinite loop,
while DataOps is illustrated as intersecting Value and Innovation Pipelines

One common misconception about DataOps is that it is just DevOps applied to
data analytics. While a little semantically misleading, the name “DataOps”
has one positive attribute. It communicates that data analytics can achieve
what software development attained with DevOps. That is to say, DataOps can
yield an order-of-magnitude improvement in quality and cycle time when data
teams utilize new tools and
methodologies. The specific ways that DataOps achieves these gains reflect the
unique people, processes and tools characteristic of data teams (versus
software development teams using DevOps). Here’s our in-depth take on
both the pronounced and subtle differences between DataOps and
DevOps.

The Intellectual Heritage of DataOps

DevOps is an approach to software development that accelerates the build
lifecycle (formerly known as release engineering) using automation. DevOps
focuses on continuous integration and continuous delivery of software by
leveraging on-demand IT resources (infrastructure as code) and by
automating integration, test and deployment of code. This merging of
software development and IT operations (“DEVelopment” and “OPerationS”)
reduces time to deployment, decreases time to market, minimizes defects,
and shortens the time required to resolve issues.

Using DevOps, leading companies have been able to reduce their software
release cycle time from months to (literally) seconds. This has
enabled them to grow and lead in fast-paced, emerging markets.
Companies like Google, Amazon and many others now release software many
times per day. By improving the quality and cycle time of code releases,
DevOps deserves a lot of credit for these companies’ success.

2 WHITEPAPER | DATAOPS IS NOT JUST DEVOPS FOR DATA | DATAKITCHEN.IO

Optimizing code builds and delivery is only one piece of the larger puzzle for
data analytics. DataOps seeks to reduce the end-to-end cycle time of data
analytics, from the origin of ideas to the literal creation of charts, graphs
and models that create value. The data lifecycle relies upon people in
addition to tools. For DataOps to be effective, it must manage collaboration
and innovation. To this end, DataOps introduces Agile Development into
data analytics so that data teams and users work together more efficiently
and effectively.

In Agile Development, the data team publishes new or updated analytics in
short increments called “sprints.” With innovation occurring in rapid
intervals, the team can continuously reassess its priorities and more easily
adapt to evolving requirements. This type of responsiveness is
impossible using a Waterfall project management methodology, which locks
a team into a long development cycle with one “big-bang” deliverable at the
end.

Studies show that software development projects complete faster and with
fewer defects when Agile Development replaces the traditional sequential
Waterfall methodology. The Agile methodology is particularly effective in
environments where requirements are quickly evolving, a situation well known
to data analytics professionals. In a DataOps setting, Agile methods enable
organizations to respond quickly to customer requirements and accelerate
time to value.

Figure 2: The intellectual heritage of DataOps.

Agile development and DevOps add significant value to data
analytics, but there is one more major component to DataOps.
Whereas Agile and DevOps relate to analytics development and
deployment, data analytics also manages and orchestrates a data
pipeline. Data continuously enters on one side of the pipeline,
progresses through a series of steps and exits in the form of reports,
models and views. The data pipeline is the “operations” side of data
analytics. It is helpful to conceptualize the data pipeline as a
manufacturing line where quality, efficiency, constraints and uptime must
be managed. To fully embrace this manufacturing mindset, we call this
pipeline the “data factory.”

In DataOps, the flow of data through operations is an important area of focus.
DataOps orchestrates, monitors and manages the data factory. One
particularly powerful lean-manufacturing tool is statistical process control
(SPC). SPC measures and monitors data and operational characteristics of the
data pipeline, ensuring that statistics remain within acceptable ranges. When
SPC is applied to data analytics, it leads to remarkable improvements in
efficiency, quality and transparency. With SPC in place, the data flowing
through the operational system is verified to be working. If an anomaly occurs,
the data analytics team will be the first to know, through an automated alert.
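
As a minimal sketch of the SPC idea described above, the following Python
example derives control limits from past measurements and flags a new value
that falls outside them. The metric name, the sample row counts and the
three-sigma threshold are illustrative assumptions, not values taken from a
real pipeline or from any particular DataOps tool.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits computed from past measurements."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread


def check_metric(name, value, limits):
    """Return an alert message if value is outside the limits, else None."""
    lower, upper = limits
    if lower <= value <= upper:
        return None
    return f"ALERT: {name}={value} outside control limits [{lower:.1f}, {upper:.1f}]"


# Hypothetical daily row counts observed in earlier pipeline runs.
baseline_row_counts = [1000, 1020, 980, 1010, 990, 1005, 995]
limits = control_limits(baseline_row_counts)

print(check_metric("daily_row_count", 1003, limits))  # within limits, prints None
print(check_metric("daily_row_count", 400, limits))   # anomaly, prints an alert
```

In practice the alert would be routed to the data team (for example by email
or chat) rather than printed, and the baseline would be refreshed as the
pipeline accumulates history.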

While the name “DataOps” implies that it borrows most heavily from DevOps, it
is all three of these methodologies — Agile, DevOps and statistical process
control — that comprise the intellectual heritage of DataOps. Agile governs
analytics development, DevOps optimizes code verification, builds and delivery
of new analytics, and SPC orchestrates and monitors the data factory. Figure 2
illustrates how Agile, DevOps and statistical process control flow into DataOps.

You can view DataOps in the context of a century-long evolution of ideas that
improve how people manage complex systems. It started with pioneers like
Deming and statistical process control — gradually these ideas crossed into the
technology space in the form of Agile, DevOps and now, DataOps.

DevOps vs. DataOps: The Human Factor

As mentioned above, DataOps is as much about managing people as it is about
tools. One subtle difference between DataOps and DevOps relates to the needs
and preferences of stakeholders.

Figure 3: DataOps and DevOps users have different mindsets

DevOps was created to serve the needs of software developers. Dev engineers
love coding and embrace technology. The requirement to learn a new
language or deploy a new tool is an opportunity, not a hassle. They take a
professional interest in all the minute details of code creation, integration and
deployment. DevOps embraces complexity.

DataOps users are often the opposite of that. They are data scientists or
analysts who are focused on building and deploying models and visualizations.
Scientists and analysts are typically not as technically savvy as engineers. They
focus on domain expertise. They are interested in getting models to be more
predictive or deciding how to best visually render data. The technology used to
create these models and visualizations is just a means to an end. Data
professionals are happiest using one or two tools — anything beyond that adds
unwelcome complexity. In extreme cases, the complexity grows beyond their
ability to manage it. DataOps accepts that data professionals live in a multi-
tool, heterogeneous world and it seeks to make that world more manageable
for them.

DevOps vs. DataOps: Process Differences

We can begin to understand the unique complexity facing data professionals
by looking at data analytics development and lifecycle processes. We find that
data analytics professionals face challenges both similar and unique relative to
software developers.

The DevOps lifecycle is commonly illustrated using a diagram in the shape of
an infinity symbol (see Figure 4). The end of the cycle (“plan”) feeds back to
the beginning (“create”), and the process iterates indefinitely.

Figure 4: The DevOps lifecycle is often depicted as an infinite loop

The DataOps lifecycle shares these iterative properties, but an important
difference is that DataOps consists of two active and intersecting pipelines
(Figure 5). The data factory, described above, is one pipeline. The other
pipeline governs how the data factory is updated — the creation and
deployment of new analytics into the data pipeline.

The data factory takes raw data sources as input and through a series
of orchestrated steps produces analytic insights that create “value” for
the organization. We call this the “Value Pipeline.” DataOps
automates orchestration and, using SPC, monitors the quality of data flowing
through the Value Pipeline.

The “Innovation Pipeline” is the process by which new analytic ideas are
introduced into the Value Pipeline. The Innovation Pipeline
conceptually resembles a DevOps development process, but upon closer
examination, several factors make the DataOps development process more
challenging than DevOps. Figure 5 shows a simplified view of the Value and
Innovation Pipelines.

Figure 5: The DataOps lifecycle — the Value and Innovation Pipelines

DevOps vs. DataOps: Development & Deployment Processes

DataOps builds upon the DevOps development model. As shown in Figure 6, the
DevOps process flow includes a series of steps that are common to software
development projects:

• Develop – create/modify an application
• Build – assemble application components
• Test – verify the application in a test environment
• Deploy – transition code into production
• Run – execute the application

DevOps introduces two foundational concepts: Continuous Integration (CI) and
Continuous Deployment (CD). CI continuously builds, integrates and tests
new code in a development environment. Build and test are automated so
they can occur rapidly and repeatedly. This allows issues to be identified and
resolved quickly. Figure 6 illustrates how CI encompasses the build and
test process stages of DevOps.

Figure 6: Comparing the DataOps and DevOps processes

CD is an automated approach to deploying or delivering software. Once an
application passes all qualification tests, DevOps deploys it into production.
Together CI and CD resolve the main constraint hampering Agile development.
Before DevOps, Agile created a rapid succession of updates and innovations
that would stall in a manual integration and deployment process. With
automated CI and CD, DevOps has enabled companies to update their
software many times per day.
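
The gating behavior of CI and CD can be sketched in a few lines of Python.
This is an illustration only: the stage names follow the process steps listed
above, and the `build`, `run_tests` and `deploy` functions are hypothetical
placeholders standing in for real automation.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failure.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, action in stages:
        if not action():
            print(f"{name} failed; later stages are skipped")
            break
        completed.append(name)
    return completed


# Hypothetical stage implementations; each returns True on success.
def build() -> bool:
    return True


def run_tests() -> bool:
    return True


def deploy() -> bool:
    return True


stages = [("build", build), ("test", run_tests), ("deploy", deploy)]
print(run_pipeline(stages))  # prints ['build', 'test', 'deploy']
```

Because deployment is simply the last stage in the list, code reaches
production only after build and test have both succeeded, which is the
property that lets updates ship without a manual hand-off.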

The Duality of Orchestration in DataOps

It’s important to note that “orchestration” occurs twice in the DataOps process
shown in Figure 6. As we explained above, DataOps orchestrates the data
factory (the Value Pipeline). The data factory consists of a pipeline process with
many steps. Imagine a complex directed acyclic graph (DAG). The
“orchestrator” could be a software entity which controls the execution of the
steps, traverses the DAG, and handles exceptions. For example, the
orchestrator might create containers, invoke runtime processes with context-
sensitive parameters, transfer data from stage to stage, and “monitor” pipeline
execution. Orchestration of the data factory is the second “orchestration” in the
DataOps process in Figure 7.
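
To make the DAG traversal concrete, here is a small sketch in Python (3.9 or
later, for the standard-library `graphlib` module). The step names and the
toy step functions are assumptions made for illustration. A production
orchestrator would also create containers, pass context-sensitive parameters
and monitor execution, none of which is shown here.

```python
from graphlib import TopologicalSorter


def orchestrate(dag, steps):
    """Execute pipeline steps in dependency order.

    dag maps each step name to the set of steps it depends on.
    steps maps each step name to a callable that receives a dict of the
    outputs of its dependencies and returns its own output.
    """
    results = {}
    for name in TopologicalSorter(dag).static_order():
        inputs = {dep: results[dep] for dep in dag.get(name, ())}
        try:
            results[name] = steps[name](inputs)
        except Exception as exc:
            # Exception handling: report which step broke the pipeline.
            raise RuntimeError(f"step {name!r} failed") from exc
    return results


# Hypothetical four-step data factory.
dag = {
    "ingest": set(),
    "transform": {"ingest"},
    "model": {"transform"},
    "report": {"transform", "model"},
}
steps = {
    "ingest": lambda inputs: [3, 1, 2],
    "transform": lambda inputs: sorted(inputs["ingest"]),
    "model": lambda inputs: sum(inputs["transform"]) / len(inputs["transform"]),
    "report": lambda inputs: (
        f"mean={inputs['model']:.1f} over {len(inputs['transform'])} rows"
    ),
}

results = orchestrate(dag, steps)
print(results["report"])  # prints mean=2.0 over 3 rows
```

The same `orchestrate` function could be pointed at a sandbox copy of the
pipeline, which is the second use of orchestration discussed next.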

Figure 7: DataOps orchestrates the data factory.

As noted above, the Innovation Pipeline has a representative copy of the
data pipeline which is used to test and verify new analytics before
deployment into production. This is the orchestration that occurs in
conjunction with “testing” and prior to “deployment” of new analytics — as
shown in Figure 8.

Orchestration occurs in both the Value and Innovation Pipelines. Similarly,
testing fulfills a dual role in DataOps.

Figure 8: DataOps orchestration controls the numerous tools that access,
transform, model, visualize and report data.

The Duality of Testing in DataOps

Tests in DataOps have a role in both the Value and Innovation Pipelines. In the
Value Pipeline, tests monitor the data values flowing through the data factory to
catch anomalies or flag data values outside statistical norms. In the Innovation
Pipeline, tests validate new analytics before deploying them.

In DataOps, tests target either data or code. In a recent blog, we discussed this
concept using Figure 9. Data that flows through the Value Pipeline is variable
and subject to statistical process control and monitoring. Tests target the data
which is continuously changing. Analytics in the Value Pipeline, on the other
hand, are fixed and change only using a formal release process. In the Value
Pipeline, analytics are revision controlled to minimize any disruptions in service
that could affect the data factory.

In the Innovation Pipeline, code is variable and data is fixed. The analytics
are revised and updated until complete. Once the sandbox is set up, the data
doesn’t usually change. In the Innovation Pipeline, tests target the code
(analytics), not the data.
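
A code test in the Innovation Pipeline might look like the sketch below.
Because the sandbox data is fixed, the expected result is known in advance,
so any change in output points to a change in the analytic itself. The data
set and the `revenue_by_region` function are hypothetical examples.

```python
# Fixed sandbox data set (hypothetical, kept small for illustration).
SANDBOX_ORDERS = [
    {"region": "east", "amount": 100.0},
    {"region": "east", "amount": 50.0},
    {"region": "west", "amount": 25.0},
]


def revenue_by_region(orders):
    """Analytic under development: total order amount per region."""
    totals = {}
    for order in orders:
        region = order["region"]
        totals[region] = totals.get(region, 0.0) + order["amount"]
    return totals


def test_revenue_by_region():
    # The data never changes, so the expected totals are constants.
    assert revenue_by_region(SANDBOX_ORDERS) == {"east": 150.0, "west": 25.0}


test_revenue_by_region()
print("code test passed")
```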

All tests must pass before promoting (merging) new code into production. A
good test suite serves as an automated form of impact analysis that runs on any
and every code change before deployment.

Some tests are aimed at both data and code. For example, a test that makes
sure that a database has the right number of rows helps your data and code
work together. Ultimately both data tests and code tests need to come together
in an integrated pipeline as shown in Figure 5. DataOps enables code and data
tests to work together so all-around quality remains high.
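
The row-count example can be sketched as follows. The `load` step and its
filtering rule are hypothetical. The point is that a failing count may be
caused either by the data (a record arrived without an id) or by the code
(the filter is too strict), so the test covers both.

```python
def load(rows):
    """Hypothetical load step that silently drops rows without an id."""
    return [row for row in rows if row.get("id") is not None]


def row_counts_match(source_rows, loaded_rows, tolerance=0):
    """Return True when the loaded row count is within tolerance of the source."""
    return abs(len(source_rows) - len(loaded_rows)) <= tolerance


source = [{"id": 1}, {"id": 2}, {"id": None}]
loaded = load(source)

print(row_counts_match(source, loaded))               # False: one row was dropped
print(row_counts_match(source, loaded, tolerance=1))  # True: within tolerance
```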

Figure 9: In DataOps, analytics quality is a function of data and code testing

DataOps Complexity: Sandbox Management

When an engineer joins a software development team, one of their first steps
is to create a “sandbox.” A sandbox is an isolated development environment
where the engineer can write and test new application features, without
impacting teammates who are developing other features in parallel. Sandbox
creation in software development is typically straightforward — the engineer
usually receives a bunch of scripts from teammates and can configure a
sandbox in a day or two. This is the typical mindset of a team using DevOps.

Sandboxes in data analytics are often more challenging from a tools and data
perspective. First of all, data teams collectively tend to use many more tools
than typical software dev teams. There are literally thousands of tools,
languages and vendors for data engineering, data science, BI, data
visualization, and governance. Without the centralization that is characteristic
of most software development teams, data teams tend to naturally diverge
with different tools and data islands scattered across the enterprise.

Figure 10: A “sandbox” is an isolated development environment where the data
professional can write and test new analytics without impacting teammates.

DataOps Complexity: Test Data Management

In order to create a dev environment for analytics, you have to create a copy of
the data factory. This requires the data professional to replicate data which
may have security, governance or licensing restrictions. It may be impractical
or expensive to copy the entire data set, so some thought and care are required
to construct a representative data set. Once a multi-terabyte data set is
sampled or filtered, it may have to be cleaned or redacted (have sensitive
information removed). The data also requires infrastructure which may not be
easy to replicate due to technical obstacles or license restrictions.
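
One possible way to build such a representative, redacted data set is
sketched below. It is an assumption-laden illustration, not a prescribed
method: records are sampled deterministically by hashing a key so that the
same records are chosen on every run, and the field names (`customer_id`,
`email`) are hypothetical.

```python
import hashlib


def in_sample(key: str, percent: int) -> bool:
    """Deterministically select roughly `percent` percent of keys."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


def redact(record: dict, sensitive_fields: set) -> dict:
    """Return a copy of record with sensitive values replaced by a placeholder."""
    return {
        field: ("REDACTED" if field in sensitive_fields else value)
        for field, value in record.items()
    }


def build_test_data(records, key_field, percent, sensitive_fields):
    """Sample records by key, then redact the sensitive fields."""
    return [
        redact(record, sensitive_fields)
        for record in records
        if in_sample(str(record[key_field]), percent)
    ]


# Hypothetical production records.
records = [
    {"customer_id": i, "email": f"user{i}@example.com", "spend": 10 * i}
    for i in range(1, 201)
]

test_data = build_test_data(records, "customer_id", 10, {"email"})
print(len(test_data), "of", len(records), "records kept")
```

Hash-based selection keeps the sample stable between sandbox rebuilds, which
helps tests in the Innovation Pipeline rely on fixed data.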

Figure 11: The concept of test data management is a first-order problem in DataOps.

The concept of test data management is a first-order problem in DataOps,
whereas in most DevOps environments, it is an afterthought. To accelerate
analytics development, DataOps has to automate the creation of development
environments with the needed data, software, hardware and libraries so
innovation keeps pace with Agile iterations.

DataOps Connects the Organization in Two Ways

DevOps strives to help development and operations (information technology)
teams work together in an integrated fashion. In DataOps, this concept is
depicted in Figure 12. The development team are the analysts, scientists,
engineers, architects and others who create data warehouses and analytics.

In data analytics, the operations team supports and monitors the data pipeline.
This can be IT, but it also includes customers — the users who create and
consume analytics. DataOps brings these groups together so they can work
together more closely.

Figure 12: DataOps combines data analytics development and data operations.

Freedom vs. Centralization

DataOps also brings the organization together across another dimension. A
great deal of data analytics development occurs in remote corners of the
enterprise, close to business units, using self-service tools like Tableau, Alteryx,
or Excel. These local teams, engaged in decentralized, distributed analytics
creation, play an essential role in delivering innovation to users. Empowering
these pockets of creativity maintains the enterprise’s competitiveness, but
frankly, a lack of top-down control can lead to unmanaged chaos.

Centralizing analytics development under the control of one group, such as IT,
enables the organization to standardize metrics, control data quality, enforce
security and governance, and eliminate islands of data. The issue is that too
much centralization chokes creativity.

One important benefit of DataOps is its ability to harmonize the
back-and-forth between the decentralized and centralized development of data
analytics: the tension between centralization and freedom.

In a DataOps enterprise, new analytics originate and undergo refinement in the
local pockets of innovation. When an idea proves useful or is worthy of wider
distribution, it is promoted to a centralized development group who can more
efficiently and robustly implement it at scale.

Figure 13: DataOps brings together centralized and distributed development

DataOps brings localized and centralized development together, enabling
organizations to reap the efficiencies of centralization while preserving
localized development, the tip of the innovation spear. DataOps brings the
enterprise together across two dimensions as shown in Figure 14:
development/operations as well as distributed/centralized development.

Figure 14: DataOps brings teams together across two dimensions:
development/operations as well as distributed/centralized development.

DataOps brings three cycles of innovation between core groups in the
organization: centralized production teams, centralized data
engineering/analytics/science/governance development teams, and groups using
self-service tools distributed into the lines of business closest to the
customer. Figure 15 shows the interlocking cycles of innovation.

Figure 15: DataOps brings three cycles of innovation between production, central data, and self-service teams.

Enterprise Example: Data Analytics Lifecycle Complexity

Having examined the DataOps development process at a high level, let’s look at
the development lifecycle in the enterprise context. Figure 16 illustrates the
complexity of analytics progression from inception to production. Analytics are
first created and developed by an individual and then merged into a team
project. After completing user acceptance testing (UAT), analytics move into
production. The goal of DataOps is to create analytics in the individual
development environment, advance into production, receive feedback from users
and then continuously improve through further iterations. This can be
challenging due to the differences in personnel, tools, code, versions, manual
procedures/automation, hardware, operating systems/libraries and target data.
The columns in Figure 16 show the varied characteristics for each of these four
environments.

The challenge of pushing analytics into production across these four quite
different environments is daunting without DataOps. It requires a patchwork of
manual operations and scripts that are in themselves complex to manage. Human
processes are error-prone, so data professionals compensate by working long
hours, mistakenly relying on hope and heroism for success. All of this results
in unnecessary complexity, confusion and a great deal of wasted time and
energy. Slow progression through the lifecycle shown in Figure 16, coupled with
high-severity errors finding their way into production, can leave a data
analytics team little time for innovation.

Figure 16: Data Analytics Development Lifecycle Complexities

Implementing DataOps

DataOps simplifies the complexity of data analytics creation and operations.
It aligns data analytics development with user priorities. It streamlines and
automates the analytics development lifecycle, from the creation of sandboxes
to deployment. DataOps controls and monitors the data factory so data quality
remains high, keeping the data team focused on adding value.

You can get started with DataOps by implementing these seven steps. You can
also adopt a DataOps Platform which will support DataOps methods within the
context of your existing tools and infrastructure.

A DataOps Platform automates the steps and processes that comprise DataOps:
sandbox management, orchestration, monitoring, testing, deployment, the data
factory, dashboards, Agile, and more. A DataOps Platform is built for data
professionals with the goal of simplifying all of the tools, steps and
processes that they need into an easy-to-use, configurable, end-to-end system.
This high degree of automation eliminates a great deal of manual work, freeing
up the team to create new and innovative analytics that maximize the value of
an organization’s data.

Learn More About DataOps
For more information about DataOps please refer to datakitchen.io.

About DataKitchen
DataKitchen, Inc. helps organizations turn data into value by offering
the world’s first DataOps platform. With DataKitchen, data and
analytic teams can orchestrate data to value and deploy features into
production while automating quality. These teams benefit from
delivering value quickly, with high quality, using the tools that they
love. DataKitchen is leading the DataOps movement to incorporate
Agile Software Development, DevOps, and manufacturing-based
statistical process control into analytics and data management.
DataKitchen is headquartered in Cambridge, Massachusetts.

© 2021 DataKitchen, Inc. All Rights Reserved. The information in this document is subject to
change without notice and should not be construed as a commitment by DataKitchen. While
reasonable precautions have been taken, DataKitchen assumes no responsibility for any errors
that may appear in this document. All products shown or mentioned are trademarks or registered
trademarks of their respective owners. | 910510A

DK Dataops Book 2nd Edition
100% (2)
DK Dataops Book 2nd Edition
189 pages
DevOps Report
100% (1)
DevOps Report
22 pages
DataKitchen Dataops Cookbook
100% (3)
DataKitchen Dataops Cookbook
142 pages
Devops With BTP
No ratings yet
Devops With BTP
27 pages
Microservice Architecture Tutorial
100% (5)
Microservice Architecture Tutorial
47 pages
DevOps Essential 2
100% (2)
DevOps Essential 2
122 pages
DevOps Tutorial
100% (1)
DevOps Tutorial
17 pages
Tamr EB Getting DataOps Right Full 05-23-19
100% (1)
Tamr EB Getting DataOps Right Full 05-23-19
66 pages
Reed - Mark DevOps - The Ultimate Beginners Guide To Learn DevOps Step by Step - 2020 - Publishing Facto
100% (1)
Reed - Mark DevOps - The Ultimate Beginners Guide To Learn DevOps Step by Step - 2020 - Publishing Facto
87 pages
Introducing DataOps Into Your Data Management Discipline - 376495
No ratings yet
Introducing DataOps Into Your Data Management Discipline - 376495
10 pages
DevOps UNIT3
No ratings yet
DevOps UNIT3
25 pages
The Guide To Big Data Powered BI and Analytics With DevOps - Narwal
No ratings yet
The Guide To Big Data Powered BI and Analytics With DevOps - Narwal
3 pages
Axis DevSecOps Training Batch-5
100% (1)
Axis DevSecOps Training Batch-5
71 pages
Implementing Various Systems With DevOps To Make Successful Decisions Based On Intelligent Learning Strategy
No ratings yet
Implementing Various Systems With DevOps To Make Successful Decisions Based On Intelligent Learning Strategy
6 pages
AEB-1184 DataOps Flipbook v2.4.2b
100% (1)
AEB-1184 DataOps Flipbook v2.4.2b
13 pages
SQL Injection
No ratings yet
SQL Injection
32 pages
The Essential Guide To DataOps
100% (1)
The Essential Guide To DataOps
16 pages
Unit 1
No ratings yet
Unit 1
35 pages
What Is Devops?
No ratings yet
What Is Devops?
8 pages
Data-Kitchen WP 7steps 710816A LR
No ratings yet
Data-Kitchen WP 7steps 710816A LR
8 pages
From AdHoc Data Analytics To DataOps - 2020 - Association For Computing Machinery Inc
No ratings yet
From AdHoc Data Analytics To DataOps - 2020 - Association For Computing Machinery Inc
10 pages
Atlan - Data Management Report
No ratings yet
Atlan - Data Management Report
14 pages
Unit 1 Devops Notes
No ratings yet
Unit 1 Devops Notes
23 pages
DevOps Report
No ratings yet
DevOps Report
22 pages
Devops Notes
No ratings yet
Devops Notes
17 pages
DataKitchen 7 Steps White Paper
No ratings yet
DataKitchen 7 Steps White Paper
6 pages
Ultimate Guide To Data Ops
No ratings yet
Ultimate Guide To Data Ops
19 pages
DataOps Is More Than Devops For Data, Delphix CTO Says
No ratings yet
DataOps Is More Than Devops For Data, Delphix CTO Says
2 pages
Unit 2 Decops Lifecycle
No ratings yet
Unit 2 Decops Lifecycle
37 pages
Online Assets Hitachi Vantara DataOps Unlocks Value of Data
No ratings yet
Online Assets Hitachi Vantara DataOps Unlocks Value of Data
15 pages
The Road To Devops Success 1
No ratings yet
The Road To Devops Success 1
5 pages
What Is DataOps - The Ultimate DataOps Guide by Rivery
No ratings yet
What Is DataOps - The Ultimate DataOps Guide by Rivery
11 pages
DevOps Reading
No ratings yet
DevOps Reading
4 pages
The Road To Devops Success: Achieve Ci/Cd With Machine Data
No ratings yet
The Road To Devops Success: Achieve Ci/Cd With Machine Data
5 pages
Unit 3 Devops
No ratings yet
Unit 3 Devops
25 pages
BG - 0618 - Application Development and DevOps
No ratings yet
BG - 0618 - Application Development and DevOps
14 pages
IBM 2553-DataOps - Whitepaper.Update-RGB-V1 1
No ratings yet
IBM 2553-DataOps - Whitepaper.Update-RGB-V1 1
13 pages
DevOps and Data Engineering A Synergistic Overview
No ratings yet
DevOps and Data Engineering A Synergistic Overview
10 pages
Seminar Documentation (6750) 1
No ratings yet
Seminar Documentation (6750) 1
19 pages
What Is Devops?
No ratings yet
What Is Devops?
8 pages
5.chap 1
No ratings yet
5.chap 1
14 pages
Lec 1
No ratings yet
Lec 1
52 pages
Trends in Dataops: Bringing Scale and Rigor To Data and Analytics
No ratings yet
Trends in Dataops: Bringing Scale and Rigor To Data and Analytics
22 pages
ScilabTec Xcos
No ratings yet
ScilabTec Xcos
31 pages
DataOps and The Future of Management
No ratings yet
DataOps and The Future of Management
8 pages
Are You A DevOps Hero Whitepaper
No ratings yet
Are You A DevOps Hero Whitepaper
8 pages
Me 3BTnaTiCv9wU52h4gdA Chamillard-C-Unity-Book
No ratings yet
Me 3BTnaTiCv9wU52h4gdA Chamillard-C-Unity-Book
509 pages
C++ The Good, Bad, and Ugly
No ratings yet
C++ The Good, Bad, and Ugly
29 pages
UGRD-CS6209 Software Engineering 1 Prelim Exam
No ratings yet
UGRD-CS6209 Software Engineering 1 Prelim Exam
5 pages
Software Tester Elite 2nd Edition
100% (1)
Software Tester Elite 2nd Edition
291 pages
BS23 - GLassDoor
No ratings yet
BS23 - GLassDoor
4 pages
Code Coverage Testing: 1. Statement Coverage (Line Coverage)
No ratings yet
Code Coverage Testing: 1. Statement Coverage (Line Coverage)
17 pages
Getting Started Guide: Alfresco Content Services 6.0
No ratings yet
Getting Started Guide: Alfresco Content Services 6.0
14 pages
SchneiderElectric AltivarMachine ATV320 DTM Library v1.1.9 ReleaseNotes
No ratings yet
SchneiderElectric AltivarMachine ATV320 DTM Library v1.1.9 ReleaseNotes
3 pages
Python - Session1
No ratings yet
Python - Session1
170 pages
OpenSAP Hanasql1 Week 1 Transcript en
No ratings yet
OpenSAP Hanasql1 Week 1 Transcript en
12 pages
File Upload Application Asp
No ratings yet
File Upload Application Asp
6 pages
Spyder-Ide - Spyder-Notebook - Jupyter Notebook Integration With Spyder
No ratings yet
Spyder-Ide - Spyder-Notebook - Jupyter Notebook Integration With Spyder
6 pages
Elance - Java Developers 2
No ratings yet
Elance - Java Developers 2
15 pages
Assignment 2
No ratings yet
Assignment 2
6 pages
Automatic Pricing
No ratings yet
Automatic Pricing
9 pages
Functions in C (Examples and Practice)
No ratings yet
Functions in C (Examples and Practice)
6 pages
Quiz3 - Take
No ratings yet
Quiz3 - Take
4 pages
Scrum Framework Student Book Edition 3
No ratings yet
Scrum Framework Student Book Edition 3
42 pages
GC 2024 12 25
No ratings yet
GC 2024 12 25
7 pages
Sogeti Neilsquest Preview PDF
No ratings yet
Sogeti Neilsquest Preview PDF
44 pages
1 TheorieLangage Compilation
No ratings yet
1 TheorieLangage Compilation
19 pages
Release Notes
No ratings yet
Release Notes

DataOps is NOT Just

DevOps for Data

One common misconception about DataOps is that it is just DevOps

The Intellectual Heritage of DataOps

Using DevOps, leading companies have been able to reduce their

2 WHITEPAPER | DATAOPS IS NOT JUST DEVOPS FOR DATA | DATAKITCHEN.IO

In Agile Development, the data team publishes new or updated analytics in

requirements and accelerate

DevOps vs. DataOps —

Figure 3: DataOps and DevOps users have different mindsets

DevOps vs. DataOps —

The DevOps lifecycle is commonly illustrated using a diagram in the shape of

Figure 4: The DevOps lifecycle is often depicted as an infinite loop

The DataOps lifecycle shares these iterative properties, but an important

The “Innovation Pipeline” is the process by which new analytic ideas

Figure 5: The DataOps lifecycle — the Value and Innovation Pipelines

DevOps vs. DataOps —

• Develop – create/modify an application

DevOps introduces two foundational concepts: Continuous Integration (CI) and

CD is an automated approach to deploying or delivering software. Once an

The Duality of Orchestration in DataOps

Figure 7: DataOps orchestrates the data factory.

Orchestration occurs in both the Value and Innovation Pipelines. Similarly,

Figure 8: DataOps orchestration controls the numerous tools that

The Duality of Testing in DataOps

Figure 9: In DataOps, analytics quality is a function of data and code testing

The concept of test data management is a first order problem in DataOps

Freedom vs. Centralization

DataOps brings localized and centralized development together enabling

DataOps brings three
