Gartner Reprint Observability
Data observability tools emerged as a market to offer a robust and holistic view of the health of data, enabling
users to observe changes, discover unknowns and take actions as needed. This research outlines how data and
analytics leaders can best leverage the data observability market’s offerings.
Overview
Key Findings
Static, event-based monitoring is no longer sufficient to effectively manage data systems and prevent critical
events. Modern data architectures are too complex and dynamic for those methods to provide a holistic view
of data’s health across data ecosystems at various stages of its life cycle.
Vendors are offering a range of different capabilities branded as data observability, causing market confusion
and tool adoption issues due to the lack of a standard, accepted definition.
Although the current data observability market focuses on data content, as well as data flow and pipeline,
Gartner sees a strong opportunity for vendors to expand their coverage in observation areas and supported
data environments.
There are two types of data observability tools in the market: embedded tools and stand-alone tools.
Embedded tools are easier to implement but have limited coverage, whereas stand-alone tools offer vast
capabilities but require a more rigorous adaptation process and technology optimization.
Recommendations
D&A leaders assessing the data observability market must:
Identify gaps in the current data ecosystem, and use them as opportunities for piloting data observability
tools. Areas where current monitoring cannot sufficiently identify critical issues, where SLAs are not met, or
where delivering data requires long-running tasks are ideal opportunities for piloting data observability tools.
Evaluate vendors with both technology and business in mind. Engage both business and technical personas
early in the vendor evaluation process to evaluate a data observability tool based on the business and
enterprise ecosystem’s requirements.
Prioritize piloting data observability tools with minimal technology requirements. Target cloud environments
first (which are the primary focus among vendors and allow easier data observability tool implementations) to
quickly assess the improvement of data quality within data pipelines and the impact on business outcomes.
Secure business value from data observability practices by readjusting business processes, responsibilities
and skill sets so that teams can respond to, and resolve, the increased volume of incidents detected through
observation.
Market Definition
This document was revised on 10 July 2024. The document you are viewing is the corrected version. For
more information, see the Corrections page on gartner.com.
Data observability tools are software applications that enable organizations to understand the state and health
of their data, data pipelines, data landscapes, data infrastructures, and the financial operational cost of the data
across distributed environments. This is accomplished by continuously monitoring, tracking, alerting, analyzing
and troubleshooting data workflows to reduce problems and prevent data errors or system downtime. The tools
also provide impact analysis, solution recommendations, collaboration and incident management. They go
beyond traditional network or application monitoring by enabling users to observe changes, discover unknowns
and take appropriate actions, with the goal of preventing firefighting and business interruption.
Organizations are looking to ensure data quality across different stages of the data life cycle. However,
traditional monitoring tools are insufficient to address unknown issues. Data observability tools learn what to
monitor and provide insights into unforeseen exceptions. They fill the gap for organizations that need better
visibility of data health and data pipelines across distributed landscapes well beyond traditional network,
infrastructure and application monitoring.
Mandatory Features
Monitor and detect to answer the question “what went wrong and what is happening?”: By connecting to the
data source and collecting and analyzing signals from various channels, data observability tools monitor data
content against business rules, policies or standards. They detect changes in data (nulls, typecastings,
min/max, row counts, etc.), and monitor data flow to determine how components of data pipelines are
operating, evaluate whether data quality meets expectations and detect data-related issues. The tools have
embedded monitoring dashboards and reports to display real-time or point-in-time status.
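The monitor-and-detect checks described above (nulls, min/max ranges, row counts) can be sketched as simple rule-based tests. This is a minimal illustration of the idea, not any vendor's implementation; the thresholds, the `amount` column and the `RULES` names are hypothetical, and a real tool would typically learn such baselines from history rather than hard-code them.

```python
import pandas as pd

# Hypothetical thresholds; a real tool would typically learn these from history.
RULES = {
    "max_null_rate": 0.05,            # at most 5% nulls per column
    "min_row_count": 1000,            # expected minimum batch size
    "amount_range": (0.0, 10_000.0),  # plausible min/max for an 'amount' column
}

def detect_issues(df: pd.DataFrame) -> list:
    """Return human-readable findings for one batch of data."""
    issues = []
    # Row-count check: detect truncated or empty loads.
    if len(df) < RULES["min_row_count"]:
        issues.append(f"row count {len(df)} below {RULES['min_row_count']}")
    # Null-rate check per column.
    for col in df.columns:
        null_rate = df[col].isna().mean()
        if null_rate > RULES["max_null_rate"]:
            issues.append(f"{col}: null rate {null_rate:.1%} exceeds threshold")
    # Min/max range check on the hypothetical 'amount' column.
    lo, hi = RULES["amount_range"]
    if "amount" in df.columns and not df["amount"].dropna().between(lo, hi).all():
        issues.append("amount: values outside expected min/max range")
    return issues
```

A dashboard or alerting layer would then consume the returned findings rather than raw signals.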
Alert and triage to answer the question “who should be notified to work on the issue, and when?”: Data
observability tools can determine the urgency and severity of issues by identifying the lineage and usage of
data. If the data quality status falls below the threshold, the tools send alerts to the right people at the
right time. Notifications and their frequency can be configured based on alert levels. IT tickets can also be
created automatically, and owners assigned.
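The triage step can be illustrated as a severity mapping driven by lineage and usage metadata. This is purely a sketch: the `Issue` fields, thresholds and recipients are hypothetical stand-ins for what a real tool would derive from its lineage graph and notification configuration.

```python
from dataclasses import dataclass

@dataclass
class Issue:
    dataset: str
    quality_score: float       # 0.0-1.0; below the threshold triggers an alert
    downstream_consumers: int  # taken from lineage metadata in a real tool

def triage(issue, threshold=0.9):
    """Return (severity, recipient) for an issue, or None if no alert is needed."""
    if issue.quality_score >= threshold:
        return None  # healthy: no notification
    # Urgency scales with how widely the affected data is consumed downstream.
    if issue.downstream_consumers >= 10:
        return ("critical", "on-call data engineer")  # page immediately
    if issue.downstream_consumers >= 1:
        return ("warning", "dataset owner")           # ticket and email
    return ("info", "observability dashboard")        # log only
```

The same logic could feed an automatic ticketing integration, with the recipient becoming the assigned owner.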
Investigate to answer the question “why did it happen and what are the possible impacts?”: The tools provide
lineage and graphs that show the origins, consumption and flow of data within data pipelines, helping users
investigate issues and locate the root cause of problems. The tools may also perform AI/ML-driven outage
analysis on historical patterns.
Common Features
Recommend solutions to answer the question “how can it be fixed?”: Some alerts are for information only;
no action is required. This is useful when a system is legacy and brittle, and stakeholders can only be
informed. Other issues, however, are critical and require immediate solutions. The tools may provide recommendations based on
root cause analysis results. Only vendors with advanced technologies offer this feature in their data
observability tools, and recommendations aren’t always available for all types of issues. This is a
differentiating factor among vendors.
Market Description
Traditional infrastructure and application performance monitoring tools are event-based tools that focus on
specific areas of the data ecosystem, with an assumption that the organization knows what it must monitor. As a
result, the tools are insufficient in addressing issues new to the organization or providing in-time support to
prevent critical data issues or system downtime.
Data observability tools fill this gap by consolidating information from different areas of the data ecosystem and
creating consistent and coherent alerts for data issues, regardless of their origins, to provide a holistic, end-to-
end visibility of the data ecosystem. These critical features of data observability are shown in Figure 1.
Figure 1: Critical Features of Data Observability

The figure shows the critical features as a cycle: Monitor and detect (“What went wrong?”); Alert and triage (“Who should be notified and when?”); Investigate (“Why it happened and possible impacts?”); Recommend (“How can it be fixed?”); and Resolve and prevent (the end goal). Source: Gartner.
Data observability tools learn what to monitor and provide insights into unforeseen exceptions, focusing on the
following critical areas shown in Figure 2:
Data content: Calculating data quality metrics such as completeness, uniqueness and accuracy of data
Data flow and pipeline: Ensuring data pipelines do not experience any interruption by capturing operational
metadata from various sources, such as system logs and trace files
Infrastructure and compute: Ensuring the data ecosystem has sufficient resources by:
Verifying that resource consumption (e.g., compute, performance, storage, network) is below the
threshold
Monitoring and analyzing current and scheduled workloads, and forecasting the necessary resources
User, usage and utilization: Helping organizations better understand how their data is used by:
Assessing the number of queries running against the data, the most frequent query and the total
execution time in a certain period
Providing information necessary for cost optimization, resource and capacity planning, budgeting and
forecasting
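Two of the data content metrics named above, completeness and uniqueness, can be computed directly from a batch of data. A minimal sketch, assuming a pandas DataFrame and a hypothetical `key` column that is expected to be unique:

```python
import pandas as pd

def quality_metrics(df: pd.DataFrame, key: str) -> dict:
    """Compute completeness and uniqueness for one batch of data.
    'key' is the column expected to be unique (an assumption for this sketch)."""
    total_cells = df.size
    # Completeness: share of non-null cells across the whole frame.
    completeness = 1.0 - df.isna().sum().sum() / total_cells
    # Uniqueness: distinct key values relative to row count.
    uniqueness = df[key].nunique(dropna=True) / max(len(df), 1)
    return {"completeness": round(float(completeness), 3),
            "uniqueness": round(float(uniqueness), 3)}
```

Accuracy, by contrast, usually requires an external reference or business rule, which is why tools treat it separately from these purely structural metrics.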
Currently, the data observability market offers these capabilities across five main observation areas as
embedded or stand-alone tools, although most tools may not cover all five areas on their own. Table 1 highlights
how data observability tools may offer each level of capability for different observation areas.
Table 1: Data Observability Features Across Five Observation Categories
Use-Case Example
An organization’s D&A leader may use a data observability tool to monitor the entire flow of data and operations
for an unreliable data pipeline. Typical monitoring targets might be changes in the schema and structure of data
used in data pipelines, the code of an extract, transform and load (ETL) job, data operations metadata and
resource consumption.
When an issue occurs within the monitoring targets, the data observability tool would send an alert to data
engineers who can investigate the issue. The tool would also provide critical information for the investigation,
such as workload analysis, data lineage and query performance analysis. Then, some data observability tools
might generate recommendations for optimizing queries, job scheduling and workload based on their analysis.
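The schema-change monitoring target in this use case can be sketched as a comparison of an observed schema snapshot against an expected one. The column names, dtypes and the `EXPECTED_SCHEMA` structure here are hypothetical examples, not any tool's actual format:

```python
# Hypothetical expected schema, kept by the tool as a name -> dtype mapping.
EXPECTED_SCHEMA = {"order_id": "int64", "amount": "float64", "ts": "object"}

def schema_drift(observed: dict) -> list:
    """Compare an observed schema snapshot against the expected one and
    return a list of human-readable changes (drops, type changes, additions)."""
    changes = []
    for col, dtype in EXPECTED_SCHEMA.items():
        if col not in observed:
            changes.append(f"column dropped: {col}")
        elif observed[col] != dtype:
            changes.append(f"type changed: {col} {dtype} -> {observed[col]}")
    for col in observed:
        if col not in EXPECTED_SCHEMA:
            changes.append(f"column added: {col}")
    return changes
```

Any non-empty result would feed the alerting step described earlier, since a dropped or retyped column commonly breaks downstream ETL jobs.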
Market Direction
One of the leading causes of the high demand for data observability is the pressure on D&A leaders to
implement emerging technologies, particularly generative AI (GenAI), in their organizations. Nine percent of
respondents in the 2024 Gartner CIO and Technology Executive Survey indicated they had already deployed
GenAI; 34% of respondents said they would do so in 2024. 2 GenAI ranked as the technology most selected to
be deployed within one year among 15 emerging technologies.
Such high demand for emerging technologies like GenAI is increasing the distribution of data landscapes, the
diversity of datasets and the need for data quality. As a result, data observability is becoming a critical
technology for supporting AI-ready data, along with metadata and D&A governance tools (see Figure 3 and
Quick Answer: What Makes Data AI-Ready?). Data observability tools provide continuous monitoring and
assessment to make sure the enterprise data is ready for AI model training and consumption, such as:
Consistency assessment
Observability metrics
Modern data stacks (see Quick Answer: What Does the Modern Data Stack Trend Mean for D&A Product
Leaders?) are the primary focus area covered by vendors in this market. In fact, most data observability vendors
support only cloud environments. This limits their applicability in large enterprises, whose data landscapes are
more complex and often mix on-premises, legacy and cloud environments.
However, as the demand grows for data observability, Gartner sees strong momentum among vendors to expand
both their observation areas and the variety of data landscapes they support. The increasing complexity in data
ecosystems will favor comprehensive data observability tools that can provide additional utilities beyond the
monitoring and detection of data issues across platforms. In addition, there’s a growing demand for data
observability tools to support a large variety of data environments.
This trend toward comprehensive tools will naturally foster competition among vendors for larger observation
areas. Such competition will also spawn additional observation areas, such as data privacy, data
security, BI/analytics, vector databases and AI/ML development. In the near term, while D&A leaders are
embracing the observability concept, adopting data observability tools and resonating with the business value
of the tools, the supplier market will continue to grow by providing a greater variety of observations. It is likely
that data observability features will be absorbed as capabilities into a broader D&A ecosystem. Vendors in the
overall data and analytics markets will also participate in the development of these tools.
Market Analysis
For example, data quality is concerned with the data itself in a business context, while data observability
extends further, covering the systems and environments that deliver that data. Data quality tools provide
data remediation capabilities and help fix data issues, whereas data observability tools offer monitoring and
observation as a baseline and may provide recommendations to fix issues.
However, data observability tools do not enforce acting on these recommendations. Users need
other mechanisms to execute the recommendations or resolve issues. These two technologies (data
observability and data quality) overlap in the following technical capabilities:
Data profiling
Metadata management
Data lineage
In recent years, some data quality vendors have included additional observability features in their products,
such as observing data content and flow. This trend exists because data quality and
observability can work together to improve the insights gleaned from the collected data. Data and analytics
leaders looking to gain the most value from their organization’s data need to maximize both data quality and
data observability. For more details on data quality market offerings, refer to Gartner’s Magic Quadrant for
Augmented Data Quality Solutions.
Data observability tools are also often confused with general observability tools that are typically related to
application performance monitoring (APM) tools. APM and observability tools are powerful analytics platforms
that ingest multiple telemetry feeds and provide critical insight into application health, performance and,
increasingly, security. They are not intended to monitor the data or anything associated with data. However,
APM tools and data observability tools have a common interest in monitoring infrastructure resources. For more
details about the current APM and general observability tools market, refer to Magic Quadrant for Application
Performance Monitoring and Observability.
For example, a data observability feature embedded in a DataOps tool is limited to the context of DataOps.
Therefore, the tools with embedded data observability features typically focus on one or two observations out of
five. These tools are not intended for enterprise-scale, end-to-end data observability across different
environments. Vendors such as Ataccama and Collibra have data observability features embedded in their data
quality solutions, with a focus on observing data content. DataKitchen, a DataOps vendor, includes data flow and
data pipeline observability in its tool.
D&A leaders can use the fragmented market to their advantage by choosing a targeted solution for their unique
needs. Diverse offerings allow D&A leaders to select a tool that can best fill the gap between their current and
desired capabilities. The tool provides crucial monitoring features to address critical data elements, pipelines or
sources with high standards, or service-level agreements (SLAs) in quality, uptime, latency and performance.
Representative Vendors
The vendors listed in this Market Guide do not constitute an exhaustive list. This section is intended to provide
more understanding of the market and its offerings.
Vendor Selection
Gartner estimates there are more than 30 vendors that cover at least one of the observation areas outlined in
this guide. Table 2 includes both embedded and stand-alone data observability tools. This list is based on the
top vendors of interest through Gartner client inquiries.
Market Recommendations
Gartner recommends D&A leaders take these actions to consider both technical and nontechnical aspects when
navigating through data observability:
Identify the gaps. There’s no need to tear down what you already have. Assess the gap between your current
monitoring capabilities (via traditional data quality or DataOps tooling) and desired capabilities regarding
critical data elements. These gaps are ideal use cases for piloting data observability tool implementations.
Evaluate vendors with both technology and business in mind. Engage both business and technical personas
early in the vendor evaluation process, since they may have different requirements and expectations. Evaluate
data observability tool offerings based on the priority of business requirements, primary users and how the
tools fit in the overall enterprise ecosystems.
Consider the variety of connectors supported by vendors. Given the increasingly complex ecosystems and
the amount of similar, and in some cases overlapping, capabilities, ensuring that this technology integrates
and connects to your current ecosystem is critical.
Pilot first, optimize later. If available, implement a data observability tool in a cloud environment first because
it is faster and easier to demonstrate value. Prioritize assessing the business value and return of the data
observability tool rather than ensuring its technology optimization throughout the data ecosystem.
Consider both capabilities and requirements. When evaluating a data observability tool, consider the
adaptations or adjustments in processes, responsibilities and skill sets necessary to secure business value
from data observability practices.
Show tangible business benefits. Partner with business stakeholders to evaluate and demonstrate the
business value of data observation practices. Include both business and technical users in the notification
strategy, if necessary. Track the improvement of data quality within data pipelines, as well as its impact on
business outcomes.
Evidence
© 2025 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. and its affiliates. This publication may not be
reproduced or distributed in any form without Gartner's prior written permission. It consists of the opinions of Gartner's research organization, which should
not be construed as statements of fact. While the information contained in this publication has been obtained from sources believed to be reliable, Gartner
disclaims all warranties as to the accuracy, completeness or adequacy of such information. Although Gartner research may address legal and financial issues,
Gartner does not provide legal or investment advice and its research should not be construed or used as such. Your access and use of this publication are
governed by Gartner’s Usage Policy. Gartner prides itself on its reputation for independence and objectivity. Its research is produced independently by its
research organization without input or influence from any third party. For further information, see "Guiding Principles on Independence and Objectivity."
Gartner research may not be used as input into or for the training or development of generative artificial intelligence, machine learning, algorithms, software,
or related technologies.