TDWI Data Quality Maturity Model Assessment Guide
By Fern Halper
TDWI RESEARCH
2024
• NEW ALGORITHMS FOR DATA QUALITY. New algorithms, approaches, and metrics are already
emerging to address the data quality of new data types and analytics. One example is BLEU,
which stands for Bilingual Evaluation Understudy, a metric used for evaluating the quality of
text that has been machine-translated from one language to another. Open source platforms
provide tools to detect bias and toxicity. Some vendors are providing confidence scores for the
output of generative AI models or allowing for the building of predictive models to assess for
hallucinations.
• AUTOMATION AND AUGMENTATION. Given the volume of new data being used by organizations, a
major trend driving data quality management is the introduction of automated and augmented
tools. Modern data quality tools automate profiling, monitoring, parsing, standardizing,
matching, merging, correcting, cleansing, and enhancing data for delivery into enterprise data
warehouses and other downstream repositories. Additionally, AI is being infused into these
tools to augment them, enabling tasks such as identifying sensitive information or detecting
outliers. These tools are found in data pipeline products as well as data catalogs,
among others. TDWI research has found that organizations that succeed with data quality are
more likely to use automated tools to manage the complexity.
• MONITORING AND OBSERVABILITY. Metrics are key for success with data quality. Data
observability is a relatively new method of ensuring data is healthy in real time and data
governance objectives are being met. It uses automated techniques to enforce consistency,
accuracy, and reliability of data. It can act as a single pane of glass across platforms to monitor
and manage data quality. TDWI sees observability tools gaining market traction.
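The BLEU metric mentioned above can be sketched in a few lines. The following is an illustrative, stdlib-only version (production toolkits add smoothing and multi-reference support, which are omitted here):

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=2):
    """Toy BLEU: geometric mean of clipped n-gram precisions times a
    brevity penalty. Real toolkits add smoothing and support multiple
    references; this sketch keeps only the core idea."""
    cand, ref = candidate.split(), reference.split()
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(cand, n))
        ref_counts = Counter(ngrams(ref, n))
        # Clip each candidate n-gram count at its count in the reference
        clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        precisions.append(clipped / max(sum(cand_counts.values()), 1))
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    # Penalize candidates shorter than the reference
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * geo_mean

score = bleu("the cat sat on the mat", "the cat is on the mat")
```

A higher score means more n-gram overlap with the reference translation; an exact match scores 1.0.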
• ROLES AND RESPONSIBILITIES: This dimension assesses the roles in place for data quality and
whether someone is accountable for data quality. It includes questions about both technical
and business staff responsibilities for data quality. Is there training for data quality? Are there
processes in place so people know if their data is of high quality?
• DATA QUALITY MANAGEMENT: This dimension examines the tools and processes in place to make
sure data is accurate, reliable, and relevant. It examines established processes for exposing and
remediating poor data quality. It also examines the importance of data quality in analytics and
responsible data and analytics.
• ASSURANCE AND IMPACT: This dimension examines how well data quality processes are working
vis-à-vis data integrity. It includes questions about data trust, compliance, and risk mitigation. It
also examines how the organization is measuring data quality.
• TOOLS: Organizations rely on a range of tools to manage and improve data quality. This
dimension tracks the availability and adoption of tools for data quality and how advanced they
are. Such tools include automated solutions as well as monitoring and observability solutions
that look at the health of the data.
Stages of Maturity
The TDWI Data Quality Maturity Model consists of five stages: Nascent, Early, Established,
Comprehensive, and Advanced/Visionary. As organizations move through these stages, they should
gain more value from their data and analytics investments. Figure 2 illustrates these stages.
This guide provides a brief overview of each of the stages of the TDWI Data Quality Maturity
Model. This description provides a context for interpreting your scores when taking the assessment.
Overview of Stages
Stage 1: Nascent
In the Nascent stage, organizations have no company-wide data quality strategy, no funding,
and little executive awareness of why data quality is vital.
Companies in this stage typically do not maintain an inventory of data assets to aid in
standardization and understanding. The organization may be implementing data quality teams,
processes, and platforms, but these usually exist in pockets, with no overall data quality
management practice across the organization. Where data quality processes exist, they are focused on
traditional data types.
Although many in the organization may not be satisfied with the quality of their data, the value of
data quality management is not communicated throughout the organization and it hasn’t become a
priority. There are minimal processes to ensure accurate, reliable, relevant, complete, or timely data.
Data quality management tools are virtually nonexistent. Many in the company may not understand
what ensuring high-quality data entails and the enterprise may report problems with data accuracy,
consistency, integrity, compliance, and relevance.
The nascent organization risks becoming a victim of its own inattention to data quality and
may struggle to move forward with analytics and other efforts.
Stage 2: Early
In the Early stage, the organization is starting to think about a more formal data quality program. It
is putting objectives in place and starting to make executives aware of the value of data quality. This
may be because an event has occurred where poor data quality impacted revenue (for instance, an
inaccurate forecast).
The company is trying to put the foundations of data quality management in place to ensure
complete and accurate data—primarily for traditional structured data. In addition to putting
processes in place for data quality, the organization is also starting to think about data quality tools,
typically for data profiling and cleansing. Stakeholders are beginning to emerge as the value of data
quality begins to be better communicated across the company.
The Early-stage organization is starting to take the steps needed to establish basic data quality
management.
Stage 3: Established
During the Established stage, there is an organizational strategy and plan in place for data quality.
This may also include preliminary metrics about what constitutes high-quality data in terms of
accuracy, completeness, timeliness, and other core data quality parameters. These metrics might be
in place for traditional structured data. Accountability and data standards are established and the
importance of data quality is being communicated across the organization.
The enterprise will most likely use tools to profile and cleanse data, but these tools may not
yet be automated, and it is researching others. Likewise, the organization may have started to
implement, or at least investigate, tools such as data catalogs and data lineage solutions to help
standardize data, provide metadata, and better understand where problems with data could have occurred.
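The kind of profiling described here (completeness, uniqueness, and outlier checks) can be sketched as follows. The records, field names, and thresholds are hypothetical, chosen only to illustrate the checks:

```python
import statistics

# Hypothetical records; field names, values, and thresholds are illustrative
records = [
    {"id": 1, "email": "a@example.com", "amount": 120.0},
    {"id": 2, "email": None,            "amount": 95.0},
    {"id": 3, "email": "c@example.com", "amount": 110.0},
    {"id": 3, "email": "c@example.com", "amount": 110.0},   # duplicated id
    {"id": 4, "email": "d@example.com", "amount": 9500.0},  # suspect value
]

def profile(rows):
    """Checks a profiling tool might automate: completeness, uniqueness,
    and a robust (median/MAD) outlier scan."""
    report = {}
    report["null_email_rate"] = sum(r["email"] is None for r in rows) / len(rows)
    report["duplicate_ids"] = len(rows) - len({r["id"] for r in rows})
    amounts = [r["amount"] for r in rows]
    med = statistics.median(amounts)
    mad = statistics.median(abs(a - med) for a in amounts)
    # Modified z-score against the median; 3.5 is a commonly used cutoff
    report["outliers"] = [
        a for a in amounts if mad and 0.6745 * abs(a - med) / mad > 3.5
    ]
    return report

report = profile(records)
```

The median-based outlier test is used instead of a mean/standard-deviation test because a single extreme value inflates the standard deviation and can hide itself.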
Those organizations in the Established phase will need to keep evolving their data quality strategy
because they will also most likely be in a position to start to collect and analyze more diverse data
and greater data volumes. This will necessitate new processes and tools, including automation.
Stage 4: Comprehensive
During the Comprehensive phase, the organization has implemented a sound data quality program
and the results are measured. Data quality is an agenda item in data governance meetings. Tools are
in place for core data quality metrics. Data quality has become a strategic imperative. The enterprise
is now building on what it has to take it to the next level with new data types, such as text, machine,
and real-time data.
These organizations may have moved forward in their analytics efforts, including building machine
learning models. That means they are also dealing with assessing both the data quality of the input
to models and what comes out of the models. Enterprises are determining what constitutes high-
quality data for new data types and implementing processes and new algorithms for determining this.
They are also working on identifying ethical considerations for data, such as data bias and fairness.
Organizations with comprehensive data quality programs are also investing in tools to help automate
different parts of the data quality and data pipeline process. This includes augmented tools to alert
them to issues in their data and observability tools so enterprises can measure the health of their data.
All of this is important as their data ecosystem becomes increasingly complex.
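The freshness, volume, and schema checks that observability tools run continuously can be sketched as below. The function name, thresholds, and column names are assumptions made for illustration, not any particular vendor's API:

```python
from datetime import datetime, timedelta, timezone

def check_table_health(last_loaded_at, row_count, expected_columns, actual_columns,
                       max_staleness=timedelta(hours=1), min_rows=1000):
    """Freshness, volume, and schema checks of the kind a data
    observability platform runs continuously against each table."""
    issues = []
    if datetime.now(timezone.utc) - last_loaded_at > max_staleness:
        issues.append("freshness: last load exceeded the staleness threshold")
    if row_count < min_rows:
        issues.append(f"volume: {row_count} rows is below the expected minimum")
    missing = set(expected_columns) - set(actual_columns)
    if missing:
        issues.append(f"schema drift: missing columns {sorted(missing)}")
    return issues

# A table that is three hours stale, short on rows, and missing a column
alerts = check_table_health(
    last_loaded_at=datetime.now(timezone.utc) - timedelta(hours=3),
    row_count=250,
    expected_columns=["order_id", "amount", "created_at"],
    actual_columns=["order_id", "amount"],
)
```

In a real platform these checks would be scheduled per table and routed to alerting rather than returned as a list.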
Stage 5: Advanced/Visionary
Very few organizations reach the Advanced/Visionary stage of data quality. That is partly because
data quality management is, in many ways, a moving target. Visionary companies have put
standard processes in place to evolve their data quality strategies as they utilize more diverse data.
Stakeholders in both business and IT are accountable for data quality, and team structures are in
place. Collaboration occurs across the organization on data quality.
These visionary companies are also using automated and augmented tools for data quality. These
tools may be part of a larger solution, such as a data catalog or a data observability platform. They
may be in the data pipelines. Metrics associated with data quality tie to business goals and these are
consistently measured and publicized.
5 or less: Nascent
6-10: Early
11-14: Established
15-18: Comprehensive
19-20: Advanced/Visionary
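The score ranges above can be expressed as a small lookup; the following is a minimal sketch in Python:

```python
# Upper score bound for each stage, per the ranges in this guide
STAGE_BOUNDS = [
    (5, "Nascent"),
    (10, "Early"),
    (14, "Established"),
    (18, "Comprehensive"),
    (20, "Advanced/Visionary"),
]

def stage_for(score):
    """Map a single dimension score to its maturity stage."""
    for upper, stage in STAGE_BOUNDS:
        if score <= upper:
            return stage
    raise ValueError("score out of range; the maximum is 20")

stage = stage_for(11)  # "Established"
```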
For instance, if you receive a score of 11 in the Organizational Commitment dimension of the
assessment, you are in the Established stage for that dimension. You should expect to see your scores
vary across the different dimensions. Data quality programs don’t necessarily evolve at the same rate
across all these dimensions.
When you complete the assessment, you might see scores such as this:
Organizational Commitment: 11 (Established)
Tools: 5 (Nascent)
This means you are more mature in your Organizational Commitment category but less mature in
some of the other areas. Understanding your relative strengths and weaknesses will help you establish
goals and, in turn, target your efforts and allocate resources.
Summary
The TDWI Data Quality Maturity Assessment provides a quick way for organizations to assess their
maturity in data quality. The assessment is based on the TDWI Data Quality Maturity Model, which
consists of five maturity stages.
The assessment serves as a relatively coarse measure of your data quality maturity. The approximately
50 questions across five categories only scratch the surface of the complexities involved in building
your data quality program. To gauge precisely where you are, it may also make sense to work with an
independent source to validate your progress.