
Licensed for Distribution

Market Guide for Data Observability Tools
25 June 2024 - ID G00765184 - 20 min read

By Melody Chien, Jason Medd, and 2 more

Data observability tools emerged as a market to offer a robust and holistic view of the health of data, enabling
users to observe changes, discover unknowns and take actions as needed. This research outlines how data and
analytics leaders can best leverage the data observability market’s offerings.

Overview

Key Findings
Static, event-based monitoring is no longer sufficient to effectively manage data systems and prevent critical
events. Modern data architectures are too complex and dynamic for those methods to provide a holistic view
of data health across data ecosystems at various stages of the data life cycle.

Vendors are offering a range of different capabilities branded as data observability, causing confusion in the
market and tool adoption issues due to the lack of a standard accepted definition.

Although the current data observability market focuses on data content, as well as data flow and pipeline,
Gartner sees a strong opportunity for vendors to expand their coverage in observation areas and supported
data environments.

There are two types of data observability tools in the market: embedded tools and stand-alone tools.
Embedded tools are easier to implement but have limited coverage, whereas stand-alone tools have vast
capabilities but require a more rigorous adaptation process and technology optimization.

Recommendations
D&A leaders assessing the data observability market must:

Identify gaps in the current data ecosystem, and use them as opportunities for piloting data observability
tools. Areas where current monitoring cannot sufficiently identify critical issues, where SLAs are not met, or
where delivering data requires long-running tasks are ideal opportunities for testing data observability tools.

Evaluate vendors with both technology and business in mind. Engage both business and technical personas
early in the vendor evaluation process to evaluate a data observability tool based on the business and
enterprise ecosystem’s requirements.

Prioritize piloting data observability tools with minimal technology requirements. Target cloud environments
first (which are the primary focus among vendors and allow easier data observability tool implementations) to
quickly assess the improvement of data quality within data pipelines and the impact on business outcomes.

Secure business value from data observability practices by readjusting business processes, responsibilities
and skill sets so that teams can respond to, and resolve, the increased volume of incidents detected through
observation.

Strategic Planning Assumption


By 2026, 50% of enterprises implementing distributed data architectures will have adopted data observability
tools to improve visibility over the state of the data landscape, up from less than 20% in 2024.

Market Definition
This document was revised on 10 July 2024. The document you are viewing is the corrected version. For
more information, see the Corrections page on gartner.com.

Data observability tools are software applications that enable organizations to understand the state and health
of their data, data pipelines, data landscapes, data infrastructures, and the financial operational cost of the data
across distributed environments. This is accomplished by continuously monitoring, tracking, alerting, analyzing
and troubleshooting data workflows to reduce problems and prevent data errors or system downtime. The tools
also provide impact analysis, solution recommendation, collaboration and incident management. They go
beyond traditional network or application monitoring by enabling users to observe changes, discover unknowns
and take appropriate actions with goals to prevent firefighting and business interruption.

Organizations are looking to ensure data quality across different stages of the data life cycle. However,
traditional monitoring tools are insufficient to address unknown issues. Data observability tools learn what to
monitor and provide insights into unforeseen exceptions. They fill the gap for organizations that need better
visibility of data health and data pipelines across distributed landscapes well beyond traditional network,
infrastructure and application monitoring.

Mandatory Features
Monitor and detect to answer the question “what went wrong and what is happening?”: By connecting to the
data source and collecting and analyzing signals from various channels, data observability tools monitor data
content against business rules, policies or standards. They detect changes in data (nulls, typecastings,
min/max, row counts, etc.), and monitor data flow to determine how components of data pipelines are
operating, evaluate whether data quality meets expectations and detect data-related issues. The tools have
embedded monitoring dashboards and reports to display real-time or point-in-time status.
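
To make the monitor-and-detect capability concrete, the following Python sketch shows the kind of content checks such tools automate, assuming a batch of records arrives as a list of dictionaries. The function name, thresholds and expected ranges are illustrative assumptions, not any vendor's API.

    # Minimal sketch of automated data-content checks (illustrative only).
    def check_batch(rows, column, expected_rows=(900, 1100),
                    max_null_rate=0.01, value_range=(0, 150)):
        """Return a list of detected issues for one batch of records."""
        issues = []
        # Row-count check: an unexpected volume change often signals a broken feed.
        if not expected_rows[0] <= len(rows) <= expected_rows[1]:
            issues.append(f"row count {len(rows)} outside {expected_rows}")
        values = [r.get(column) for r in rows]
        nulls = sum(v is None for v in values)
        # Null-rate check: detects completeness degradation.
        if rows and nulls / len(rows) > max_null_rate:
            issues.append(f"{column}: null rate {nulls / len(rows):.2%}")
        # Min/max check: out-of-range values can indicate typecasting errors.
        present = [v for v in values if v is not None]
        if present and (min(present) < value_range[0] or max(present) > value_range[1]):
            issues.append(f"{column}: values outside {value_range}")
        return issues

    print(check_batch([{"age": 42}, {"age": None}], "age"))

In practice, such tools learn the expected ranges from historical batches rather than relying on hard-coded values like those above.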

Alert and triage to answer the question “who should be notified to work on the issue, and when?”: Data
observability tools can determine the urgency and severity of issues by identifying the lineage and usage of
data. If the data quality status falls below the threshold, the tools will send alerts to the right people at the
right time. Notification and frequency can be configured based on the alert levels. IT tickets can also be
automatically created, and owners assigned.
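
As a rough sketch of how triage might work, the following assumes the tool already knows the data quality score, the alert threshold, the number of downstream consumers (derived from lineage) and the owner; the severity rules and names are invented for illustration.

    # Illustrative triage: severity reflects how far quality fell below the
    # threshold and how many downstream assets consume the data.
    def triage(quality_score, threshold, downstream_consumers, owner):
        if quality_score >= threshold:
            return None  # healthy; no alert needed
        gap = threshold - quality_score
        if gap > 0.2 or downstream_consumers > 10:
            severity = "critical"  # notify the owner immediately
        elif gap > 0.05:
            severity = "high"      # open a ticket and assign the owner
        else:
            severity = "info"      # log only, batch into a digest
        return {"severity": severity, "notify": owner}

    print(triage(0.72, 0.95, downstream_consumers=14, owner="data-eng-oncall"))
    # -> {'severity': 'critical', 'notify': 'data-eng-oncall'}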

Investigate to answer the question “why did it happen and what are the possible impacts?”: The tools provide
lineage and graphs to show the origins, consumption and flow of data within the data pipelines, for
investigating issues and locating the root cause of the problems. The tools may also perform AI/ML-driven
outage analysis on historical patterns.
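
A minimal sketch of the lineage traversal behind such investigation, assuming lineage is available as a parent-to-children map (the asset names and graph format are hypothetical): walking downstream yields the potential impact of an issue, while walking upstream yields root-cause candidates.

    # Hypothetical lineage graph: parent -> children.
    LINEAGE = {
        "raw.orders":   ["stg.orders"],
        "stg.orders":   ["mart.revenue", "mart.churn"],
        "mart.revenue": ["dashboard.finance"],
    }

    def downstream(asset, graph=LINEAGE):
        """All assets reachable from `asset` (impact analysis)."""
        seen, stack = set(), [asset]
        while stack:
            for child in graph.get(stack.pop(), []):
                if child not in seen:
                    seen.add(child)
                    stack.append(child)
        return seen

    def upstream(asset, graph=LINEAGE):
        """All assets that feed `asset` (root-cause candidates)."""
        inverted = {}
        for parent, children in graph.items():
            for child in children:
                inverted.setdefault(child, []).append(parent)
        return downstream(asset, inverted)

    print(downstream("stg.orders"))        # blast radius of a staging issue
    print(upstream("dashboard.finance"))   # where a dashboard error may originate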

Common Features
Recommend solutions to answer the question “how can it be fixed?”: Some alerts can be for information
only; no action is required. This is useful for legacy, brittle systems that can only be informed about issues,
not fixed automatically. But other issues are critical and require immediate solutions. The tools may provide recommendations based on
root cause analysis results. Only vendors with advanced technologies offer this feature in their data
observability tools, and recommendations aren’t always available for all types of issues. This is a
differentiating factor among vendors.
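
As a toy sketch of what a recommendation step might look like, the following maps root-cause categories to suggested fixes; the categories and actions are invented, and, as noted above, real tools cannot produce a recommendation for every issue type.

    # Illustrative recommendation lookup keyed by root-cause category.
    RECOMMENDATIONS = {
        "schema_drift": "Update downstream contracts or pin the schema version.",
        "late_arrival": "Extend the SLA window or backfill the affected partition.",
        "null_spike":   "Quarantine bad rows and add a not-null check upstream.",
    }

    def recommend(root_cause):
        # None means the alert stays information-only.
        return RECOMMENDATIONS.get(root_cause)

    print(recommend("schema_drift"))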

Market Description
Traditional infrastructure and application performance monitoring tools are event-based tools that focus on
specific areas of the data ecosystem, with an assumption that the organization knows what it must monitor. As a
result, the tools are insufficient for addressing issues that are new to the organization or for providing timely
support to prevent critical data issues or system downtime.

Data observability tools fill this gap by consolidating information from different areas of the data ecosystem and
creating consistent and coherent alerts for data issues, regardless of their origins, to provide holistic, end-to-end
visibility of the data ecosystem. The critical features of data observability are shown in Figure 1.

Figure 1: Critical Features of Data Observability

[Figure 1 shows five features in a cycle: Monitor and detect ("What went wrong?"); Alert and triage ("Who should be notified and when?"); Investigate ("Why it happened and possible impacts?"); Recommend ("How can it be fixed?"); and, as the end goal, Resolve and prevent. Source: Gartner]

Data observability tools learn what to monitor and provide insights into unforeseen exceptions, focusing on the
following critical areas shown in Figure 2:

Data content: Improving data content’s quality by:

Calculating data quality metrics such as completeness, uniqueness and accuracy of data

Identifying anomalies, outliers, patterns and violations against business rules

Detecting changes in schema, volume and data quality level

Data flow and pipeline: Ensuring data pipelines do not experience any interruption by:

Monitoring the data pipelines’ components

Identifying issues in data execution jobs, events, applications or code interactions

Checking for drift in schemas, code or configurations

Finding bottlenecks, broken pipelines, or failed or incomplete jobs

Infrastructure and compute: Ensuring the data ecosystem has sufficient resources by:

Capturing operational metadata from various sources, such as system logs and trace files

Verifying that the resource consumption (e.g., compute, performance, storage, network) is below the
threshold

Monitoring and analyzing current and scheduled workload, and forecasting the necessary resources

User, usage and utilization: Helping organizations better understand how their data is used by:

Determining who owns, changes and reads the data

Identifying how often a user accesses specific data

Assessing the number of queries running against the data, the most frequent query and the total
execution time in a certain period

Financial allocation: Reducing the cost of the data ecosystem (see the sketch after this list) by:

Analyzing the underlying cost associated with each dataset

Providing information necessary for cost optimization, resource and capacity planning, budgeting and
forecasting
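
As a toy illustration of the financial allocation area, the sketch below apportions a shared compute bill to datasets in proportion to their query execution time; the field names and figures are invented.

    # Toy cost attribution: split a shared bill by query execution time.
    from collections import defaultdict

    def allocate_cost(usage_records, total_bill):
        """usage_records: [{'dataset': str, 'exec_seconds': float}, ...]"""
        seconds = defaultdict(float)
        for rec in usage_records:
            seconds[rec["dataset"]] += rec["exec_seconds"]
        total = sum(seconds.values()) or 1.0
        return {ds: round(total_bill * s / total, 2) for ds, s in seconds.items()}

    usage = [{"dataset": "mart.revenue", "exec_seconds": 300},
             {"dataset": "mart.churn",   "exec_seconds": 100}]
    print(allocate_cost(usage, total_bill=400.0))
    # -> {'mart.revenue': 300.0, 'mart.churn': 100.0}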

Figure 2: Current Landscape of Data Observability — Five Observation Categories

Currently, the data observability market offers these capabilities across five main observation areas as
embedded or stand-alone tools, although most tools may not cover all five areas on their own. Table 1 highlights
how data observability tools may offer each level of capability for different observation areas.

Table 1: Data Observability Features Across Five Observation Categories

Data content

Level 1 (Monitor and detect): Data catalog; data profiling; data quality assessment; anomaly detection

Level 2 (Alert and triage): Semantic drift alert; rule violation alert; data quality threshold alert

Level 3 (Investigate): Impact analysis with lineage; pattern and trend analysis with historical data

Level 4 (Recommend): Data policy enforcement; auto data remediation; data quality rule recommendation

Data flow and pipeline

Level 1 (Monitor and detect): ELT job monitoring; data pipeline monitoring; drift detection over schemas, code and schedules

Level 2 (Alert and triage): Schema or code drift alert; data pipeline failure alert; data pipeline performance alert

Level 3 (Investigate): Data pipeline lineage; workload analysis; broken pipeline and failed job analysis

Level 4 (Recommend): Query optimization; pipeline and workload optimization

Source: Gartner (June 2024)

Use-Case Example

An organization’s D&A leader may use a data observability tool to monitor the entire flow of data and operations
for an unreliable data pipeline. Typical monitoring targets might be changes in the schema and structure of data
used in data pipelines, the code of an extract, transform and load (ETL) job, data operations metadata and
resource consumption.

When an issue occurs within the monitoring targets, the data observability tool would send an alert to data
engineers who can investigate the issue. The tool would also provide critical information for the investigation,
such as workload analysis, data lineage and query performance analysis. Then, some data observability tools
might generate recommendations for optimizing queries, job scheduling and workload based on their analysis.
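
For this use case, one of the monitoring targets above, schema drift, can be sketched as a comparison of the source table's current column snapshot against a baseline captured when the ETL job last ran cleanly; the table and type names below are hypothetical.

    # Illustrative schema-drift check for an unreliable pipeline.
    def schema_drift(baseline, current):
        """Both arguments map column name -> type string."""
        added   = {c: t for c, t in current.items() if c not in baseline}
        removed = {c: t for c, t in baseline.items() if c not in current}
        retyped = {c: (baseline[c], t) for c, t in current.items()
                   if c in baseline and baseline[c] != t}
        return {"added": added, "removed": removed, "retyped": retyped}

    baseline = {"order_id": "int", "amount": "decimal", "ts": "timestamp"}
    current  = {"order_id": "int", "amount": "varchar", "ts": "timestamp",
                "currency": "varchar"}
    print(schema_drift(baseline, current))
    # 'amount' changed type (a typical ETL breakage) and 'currency' was added.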

Market Direction

Demand for Data Observability Will Increase


Gartner sees strong growth in demand for data observability over the next few years. According to Gartner’s Chief
Data and Analytics Officer Agenda Survey for 2024, 22% of respondents said they have already implemented
data observability tools. In addition, 38% of respondents claimed they will be piloting or deploying the tools
within 12 months, and 28% said they will pilot or deploy the tools within one to two years. Most of the D&A
leaders (65%) also responded that data observability would be a core element of their data strategy within two
years. 1

One of the leading causes of the high demand for data observability is the pressure on D&A leaders to
implement emerging technologies, particularly generative AI (GenAI), in their organizations. Nine percent of
respondents in the 2024 Gartner CIO and Technology Executive Survey indicated they had already deployed
GenAI; 34% of respondents said they would do so in 2024. 2 GenAI ranked as the technology most selected to
be deployed within one year among 15 emerging technologies.

Such high demand for emerging technologies like GenAI is increasing the distribution of data landscapes, the
diversity of datasets and the need for data quality. As a result, data observability is becoming a critical
technology for supporting AI-ready data, along with metadata and D&A governance tools (see Figure 3 and
Quick Answer: What Makes Data AI-Ready?). Data observability tools provide continuous monitoring and
assessment to make sure the enterprise data is ready for AI model training and consumption, such as:

Consistency assessment

Validation and verification

Continuous regression testing (see the sketch after this list)

Inference and derivation

Observability metric

Monitoring and detection
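
As a small sketch of the continuous regression testing item above, the following flags a feature whose mean in a new data snapshot drifts more than a few standard deviations from its historical baseline; the threshold and values are illustrative.

    # Toy regression test for AI-ready data: flag distribution drift.
    from statistics import mean, stdev

    def regression_check(history, new_values, k=3.0):
        mu, sigma = mean(history), stdev(history)
        drifted = abs(mean(new_values) - mu) > k * (sigma or 1.0)
        return {"baseline_mean": mu, "new_mean": mean(new_values), "drifted": drifted}

    print(regression_check(history=[10, 11, 9, 10, 10], new_values=[19, 21, 20]))
    # -> drifted: True (the new mean, 20, is far from the baseline of 10)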

Figure 3: Key Tools to Make Your Data AI-Ready: Data Observability

Observation Area and Environment Coverage Will Expand


Although the market as a whole covers the five main areas of observation, only a few vendors cover every area.
Currently, most vendors focus on the data content and data flow and pipeline areas. Coverage of the
infrastructure and compute and the user, usage and utilization areas is limited to specific environments, and the
financial allocation area is often seen by vendors as a nice-to-have add-on.

Modern data stacks (see Quick Answer: What Does the Modern Data Stack Trend Mean for D&A Product
Leaders?) are the primary focus area covered by vendors in this market. In fact, most data observability vendors
support only cloud environments. This limits their applicability in large enterprises, whose data landscapes are
more complex and often mix on-premises, legacy and cloud environments.

However, as demand for data observability grows, Gartner sees strong momentum among vendors to expand
their observability coverage areas and the variety of data landscapes they support. The increasing complexity of
data ecosystems will favor comprehensive data observability tools that can provide additional utilities beyond
the monitoring and detection of data issues across platforms. In addition, there is growing demand for data
observability tools to support a large variety of data environments.

This trend toward comprehensive tools will naturally foster competition among vendors over broader observation
areas. Such competition will also trigger additional observation areas to form, such as data privacy, data
security, BI/analytics, vector databases and AI/ML development. In the near term, while D&A leaders are
embracing the observability concept, adopting data observability tools and resonating with the business value
of the tools, the supplier market will continue to grow by providing a greater variety of observations. It is likely
that data observability features will eventually be absorbed as capabilities in a broader D&A ecosystem. Vendors
in the overall data and analytics markets will also participate in the development of these tools.

Market Analysis

Confusion With Data Quality Solutions or General Observability/Application Monitoring Tools
People often look at data observability primarily through the lens of data quality and see them both as
interchangeable terms. While they are similar and have overlapping areas, data quality and observability are very
different concepts.

For example, data quality is concerned with the data itself in a business context, while data observability has
additional interests: it also concerns the systems and environments that deliver that data. Data quality provides
data remediation capability and helps fix data issues, whereas data observability offers monitoring and
observation as a baseline and may provide recommendations to fix the issue.

However, data observability tools do not enforce those recommendations. Users need to use other mechanisms
to execute the recommendations or resolve issues. These two technologies (data observability and data quality)
overlap in the following technical capabilities:

Data profiling

Data quality/data content monitoring

Metadata management

Data lineage

Combining Data Observability With Data Quality

In recent years, some data quality vendors have included additional observability features in their products,
such as observing data content and flow. This trend exists because data quality and observability
can work together to improve the insights gleaned from the collected data. Data and analytics
leaders looking to gain the most value from their organization’s data need to maximize both data quality and
data observability. For more details on data quality market offerings, refer to Gartner’s Magic Quadrant for
Augmented Data Quality Solutions.

Data observability tools are also often confused with general observability tools that are typically related to
application performance monitoring (APM) tools. APM and observability tools are powerful analytics platforms
that ingest multiple telemetry feeds and provide critical insight into application health, performance and,
increasingly, security. They are not intended to monitor the data or anything associated with data. However,
APM tools and data observability tools have a common interest in monitoring infrastructure resources. For more
details about the current APM and general observability tools market, refer to Magic Quadrant for Application
Performance Monitoring and Observability.

Split Market: Embedded and Stand-Alone Tools


Currently, the data observability market is split between embedded and stand-alone tools with unique
capabilities. Some existing tools — such as data quality solutions, DataOps tools, data warehouse platforms and
ETL tools — have embraced data observability capabilities and offer them as add-ons or embedded data
observability features. At the same time, numerous startup vendors provide stand-alone data observability tools
as a dedicated offering for end-to-end observations, creating a new market for data observability.

Embedded Data Observability Tools


Embedded tools are integrated into specific application areas or environments and provide a seamless process
for observability within their own platforms to identify data issues. However, they are not intended to provide
end-to-end visibility across the five observation areas, and they are difficult (if not impossible) to use outside
the main tool or environment in which they are embedded.

For example, a data observability feature embedded in a DataOps tool is limited to the context of DataOps.
Therefore, tools with embedded data observability features typically focus on one or two of the five observation
areas. These tools are not intended for enterprise-scale, end-to-end data observability across different
environments. Vendors such as Ataccama and Collibra have data observability features embedded in their data
quality solutions, with a focus on observing data content. DataKitchen, a DataOps vendor, includes data flow and
data pipeline observability in its tool.

Stand-Alone Data Observability Tools


Stand-alone tools interoperate with other tools or applications and are able to connect to a wide range
of data environments, allowing them to provide a more comprehensive view of the data ecosystem. Stand-alone
tools can also support multiple observation categories and use cases. However, they require a more
burdensome implementation process and more optimization to harness their full potential. Vendors such as
Acceldata, Bigeye and Monte Carlo offer stand-alone data observability tools.

Market Fragmentation Provides Targeted Solutions Based on Specific Use Cases
Because data observability is still an emerging technology, there is no standard definition for data observability,
nor is there general agreement across vendors on what it should cover. The current market is fragmented, with
very few vendors offering comprehensive tools that cover all five observation areas (data content; data flow and
pipelines; infrastructure and compute; user, usage and utilization; and financial allocation). Vendors offer varying
observation area coverage, deployment options and capabilities, and most data observability tools in the market
focus on specific needs.

D&A leaders can use the fragmented market to their advantage by choosing a targeted solution for their unique
needs. Diverse offerings allow D&A leaders to select a tool that can best fill the gap between their current and
desired capabilities. A targeted tool provides crucial monitoring features for critical data elements, pipelines or
sources that carry high standards or service-level agreements (SLAs) for quality, uptime, latency and performance.

Representative Vendors
The vendors listed in this Market Guide do not constitute an exhaustive list. This section is intended to provide
more understanding of the market and its offerings.

Vendor Selection
Gartner estimates there are more than 30 vendors that cover at least one of the observation areas outlined in
this guide. Table 2 includes both embedded and stand-alone data observability tools. This list is based on the
top vendors of interest through Gartner client inquiries.

Table 2: Representative Vendors in Data Observability Tools

Vendor Headquarters Product Name

Acceldata California, U.S. Acceldata Data Observability Cloud

Acryl Data California, U.S. Acryl DataHub

Adeptia Illinois, U.S. Adeptia Connect

Anomalo California, U.S. Anomalo

Ataccama Toronto, Canada Ataccama ONE

Bigeye California, U.S. Bigeye

Collibra New York, U.S. Collibra Data Quality & Observability

DataKitchen Massachusetts, U.S. DataOps Observability Software

DataOps.live England, U.K. DataOps.live


Source: Gartner (June 2024)

Market Recommendations
Gartner recommends D&A leaders take these actions to consider both technical and nontechnical aspects when
navigating through data observability:

Identify the gaps. There’s no need to tear down what you already have. Assess the gap between your current
monitoring capabilities (via traditional data quality or DataOps tooling) and desired capabilities regarding
critical data elements. These gaps are ideal use cases for piloting data observability tool implementations.

Evaluate vendors with both technology and business in mind. Engage both business and technical personas
early in the vendor evaluation process since they may have different requirements and expectations. Evaluate
data observability tool offerings based on the priority of business requirements, primary users and how the
tools fit in the overall enterprise ecosystems.

Consider the variety of connectors supported by vendors. Given the increasingly complex ecosystems and
the amount of similar, and in some cases overlapping, capabilities, ensuring that this technology integrates
and connects to your current ecosystem is critical.

Pilot first, optimize later. If available, implement a data observability tool in a cloud environment first because
it is faster and easier to demonstrate value. Prioritize assessing the business value and return of the data
observability tool rather than ensuring its technology optimization throughout the data ecosystem.

Consider both capabilities and requirements. When evaluating a data observability tool, consider the adaptation
or adjustment in processes, responsibilities and skill sets necessary for securing business value from
data observability practices.

Show tangible business benefits. Partner with business stakeholders to evaluate and demonstrate the
business value of data observability practices. Include both business and technical users in the notification
strategy, if necessary. Track the improvement of data quality within data pipelines, as well as its impact on
business outcomes.

Evidence

Note 1: Gartner’s Initial Market Coverage


This Market Guide provides Gartner’s initial coverage of the market and focuses on the market definition,
rationale and dynamics.

© 2025 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates. This publication may not be
reproduced or distributed in any form without Gartner's prior written permission. It consists of the opinions of Gartner's research organization, which should
not be construed as statements of fact. While the information contained in this publication has been obtained from sources believed to be reliable, Gartner
disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner research may address legal and financial issues,
Gartner does not provide legal or investment advice and its research should not be construed or used as such. Your access and use of this publication are
governed by Gartner’s Usage Policy. Gartner prides itself on its reputation for independence and objectivity. Its research is produced independently by its
research organization without input or influence from any third party. For further information, see "Guiding Principles on Independence and Objectivity."
Gartner research may not be used as input into or for the training or development of generative artificial intelligence, machine learning, algorithms, software,
or related technologies.