Module 7. Data Quality

The document outlines the importance of Data Quality Management and provides a framework for understanding and ensuring data quality through various dimensions such as validity, reliability, timeliness, precision, and integrity. It discusses threats to data quality and offers strategies for managing these threats during different stages of data handling, including source, collection, collation, analysis, reporting, and usage. Additionally, it emphasizes the need for audits and training to maintain high data quality standards.


Data Quality Management

Learning Objectives
 At the end of this session, participants will be able to:
 Summarize basic terminology regarding data quality management
 List and describe the five threats to data quality
 Identify possible threats to data quality in an information system
 Generate a plan to manage identified threats to data quality
Why M&E is important
 M&E promotes organizational learning and encourages adaptive management
Content of Workshop Session
 What is Data Quality?
 The criteria for data quality
 Constructing a data quality plan
 Data quality auditing
What is Data Quality?
 A 'chess game' of cost versus quality
 Criterion-based evaluation of data
 Criterion-based system of data management
Dimensions of Data Quality
 Validity
 Reliability
 Timeliness
 Precision
 Integrity
Definition of Validity
 A characteristic of measurement in which a tool actually measures what the researcher intended to measure
 Have we actually measured what we intended?
Threats to Validity
 Definitional issues
 Proxy measures
 Inclusions / Exclusions
 Data sources
Validity:
Questions to ask yourself…
 Is there a relationship between the activity or program and what you are measuring?
 What is the data transcription process? Is there potential for error?
 Are steps being taken to limit transcription error (e.g., double keying of data for large surveys, built-in validation checks, random checks)? (A validation-check sketch follows this list.)
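To make the idea of a built-in validation check concrete, here is a minimal Python sketch. The field names, allowed ranges, and date format are illustrative assumptions, not part of the original slides; real checks would mirror the project's own collection forms.

# Sketch of built-in validation checks at data entry (hypothetical fields and rules).
import re

RULES = {
    "age": lambda v: v.isdigit() and 0 <= int(v) <= 120,                        # plausible age range
    "sex": lambda v: v.strip().upper() in {"M", "F"},                           # allowed codes
    "visit_date": lambda v: re.fullmatch(r"\d{4}-\d{2}-\d{2}", v) is not None,  # YYYY-MM-DD format
}

def validate_record(record):
    """Return a list of (field, value, problem) for every rule the record violates."""
    problems = []
    for field, check in RULES.items():
        value = record.get(field, "")
        if not check(value):
            problems.append((field, value, "missing, out of range, or wrong format"))
    return problems

# Example: a record with a transcription slip in 'age' and a malformed date
record = {"age": "340", "sex": "M", "visit_date": "13/01/2024"}
for field, value, problem in validate_record(record):
    print(f"{field}={value!r}: {problem}")

Checks like these can run as data are keyed in, so transcription errors are caught while the source form is still at hand.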
Definition of Reliability
 ‘A characteristic of measurement concerned
with consistency’
 Can we consistently measure what we
intended?
Threats to Reliability I
 Time
 Place
 People
Threats to Reliability II

 Collection methodologies
 Collection instruments
 Personnel issues
 Analysis and manipulation methodologies
Reliability:
Questions to ask yourself…
 Is the same instrument used from year to year, site to site?
 Is the same data collection process used from year to year, site to site?
 Are there procedures in place to ensure that data are free of significant error and that bias is not introduced (e.g., instructions, indicator information sheets, training, etc.)?
Definition of Timeliness
 The relationship between the time of collection, collation, and reporting and the relevance of the data for decision-making processes
 Do the data still have relevance and value when reported?
Threats to Timeliness

 Collection frequencies
 Reporting frequencies
 Time dependency
Timeliness:
Questions to ask yourself…
 Are data available on a frequent enough basis to inform program management decisions?
 Is a regularized schedule of data collection in place to meet program management needs?
 Are data from within the reporting period of interest (i.e., are the data from a point in time after the intervention has begun)?
 Are the data reported as soon as possible after collection?
Definition of Precision
 Accuracy (a measure of bias)
 Precision (a measure of error)
 Is the margin of error in the data less than the expected change the project was designed to effect? (A worked example follows.)
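As a hedged worked example of that last question (all numbers are illustrative assumptions, not from the original slides): for a proportion measured in a survey, the margin of error at roughly 95% confidence is z * sqrt(p * (1 - p) / n), and it should be smaller than the change the project is designed to effect.

# Illustrative margin-of-error check for a survey proportion (assumed numbers).
import math

p = 0.40   # observed baseline proportion (assumption)
n = 400    # number of respondents (assumption)
z = 1.96   # z-score for ~95% confidence

margin_of_error = z * math.sqrt(p * (1 - p) / n)   # about 0.048, i.e. roughly 5 percentage points
expected_change = 0.10                             # the project aims for a 10-point shift (assumption)

print(f"margin of error: +/-{margin_of_error:.3f}")
print("margin of error smaller than expected change:", margin_of_error < expected_change)

If the margin of error were larger than the expected change, the data could not distinguish real program effects from sampling noise.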
Threats to Precision
 Source error / bias
 Instrumentation error
 Transcription error
 Manipulation error
Precision:
Questions to ask yourself…
 Is the margin of error less than the expected change being measured?
 Are the margins of error acceptable for program decision making?
 Have issues around precision been reported?
 Would an increase in the degree of accuracy be more costly than the increased value of the information?
Good Data are Valid, Reliable and Precise
[Figure: three target-style scatter plots comparing (1) data that are neither accurate/valid, reliable, nor precise; (2) data that are reliable and precise but not accurate/valid; and (3) data that are accurate/valid, reliable, and precise]
Definition of Integrity
 A measure of the 'truthfulness' of the data
 Are the data free from 'untruth' introduced by either human or technical means, whether willfully or unconsciously?
Threats to Integrity I
 Time
 Temptation
 Technology
Threats to Integrity II
 Corruption, intentional or unintentional
 Personal manipulations
 Technological failures
 Lack of audit verification and validation
The Data Quality Plan
 Operational plan for managing data quality
 Indicator Information Sheets
 Includes a Data Quality Risk Analysis
 Includes an audit trail reference
Framework for Data Quality Assessments
[Diagram: the three components of the assessment framework and how they relate]
 Data Management System: data management processes/procedures at each stage of data handling (Source, Collection, Collation, Analysis, Reporting, Usage)
 Data Quality System: data quality processes/procedures addressing each dimension (Validity, Reliability, Integrity, Precision, Timeliness)
 Auditable Risk Verification System: a paper trail that allows verification of the entire DMS and the data produced within it

Relationships with a Data System
[Diagram]
Data Quality Issues at SOURCE
 The potential risk of poor data quality increases with secondary and tertiary data sources
 Examples:
 Validity: data could be incomplete (incomplete doctors' notes, illegible notes in patient files)
 Reliability: inconsistent recording of information by different staff because of differing skill levels
To Ensure Data Quality at SOURCE
 Design instruments carefully and correctly
 Include data providers (community stakeholders) and data processors in decisions about what is feasible to collect, in reviewing the process, and in drafting instruments
 Develop and document instructions for the data collectors, on the collection forms, and for computer procedures
To Ensure Data Quality at SOURCE
 Ensure all personnel are trained in their assigned tasks; use one trainer if possible
 Develop an appropriate sample
Data Quality Issues at COLLECTION
 Incomplete entries in spreadsheets
 Incorrect data transcriptions
 Data entered in wrong fields in a database
 Inconsistent entries of data by different data capturers
To ensure data quality during COLLECTION
 Develop specific instructions for data collection
 Routinely check to see if instructions are being followed
 Identify what to do if you (or someone) wants to make a change to the data collection process or if you have problems during data collection (a change management process)
 Check to see if people follow the change management process
 Ensure all data collection, entry, and analysis supplies are available (pens, paper, forms, computers)
To ensure data quality during COLLECTION
 Train data collectors in how to collect information
 Develop SOPs for managing the collected data (e.g., moving data from one point to the next)
 Develop SOPs for revising the collection tool
 Communicate the process and SOPs
 Conduct on-site reviews during the process
To ensure data quality during COLLATION
 Develop checklists and sign-offs for key steps
 Conduct reviews during the entry process
 Create an electronic or manual format that includes a data verification process by a second individual who is not entering the data (a double-entry comparison sketch follows this list)
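A minimal sketch of what that second-person verification can look like in practice, assuming two independent entry passes stored as simple dictionaries; the record IDs and field names are hypothetical. Every disagreement is flagged so it can be resolved against the source form.

# Sketch of double-entry verification: compare two independent data-entry passes
# and flag every mismatch for resolution against the original form.
first_pass = {
    "P001": {"age": "34", "weight_kg": "61.5"},
    "P002": {"age": "27", "weight_kg": "70.0"},
}
second_pass = {
    "P001": {"age": "34", "weight_kg": "61.5"},
    "P002": {"age": "72", "weight_kg": "70.0"},   # transposed digits by the second enterer
}

def compare_entries(entry_a, entry_b):
    """Yield (record_id, field, value_a, value_b) for every disagreement."""
    for record_id in sorted(set(entry_a) | set(entry_b)):
        a, b = entry_a.get(record_id, {}), entry_b.get(record_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                yield record_id, field, a.get(field), b.get(field)

for record_id, field, v1, v2 in compare_entries(first_pass, second_pass):
    print(f"Mismatch in {record_id}.{field}: first={v1!r}, second={v2!r} -> check the source form")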
To ensure data quality during COLLATION
 Randomly sample data and verify
 Ensure problems are reported, documented, corrected, communicated, and tracked back to the source of the problem
To ensure data quality during ANALYSIS
 Ensure analysis techniques meet the requirements for proper use
 Disclose all conditions/assumptions affecting interpretation of the data
 Have experts review reports for reasonableness of analysis
To ensure data quality during REPORTING
 Synthesize results for the appropriate audience
 Maintain integrity in reporting: don't leave out key information
 Have multiple reviewers within the organization prior to dissemination
 Protect confidentiality in reports / communication tools
 Review data and provide feedback with those who have a stake in the results
To ensure data quality during USAGE
 Understand your data!
 Use your data!
Minimizing Data Quality Risks
 Technology
   Ensuring that data analysis/statistical software is up to date
   Streamlining instruments and data collection methods
 Competence of personnel
   Ensuring that staff are well-versed in all stages of the data management process (data collection, entry, assessment, risk analysis, etc.)
   Proficiency with data software
 Documentation and audit trails
 Outsourcing
Data Quality Audits
 Verification
 Validation
 Self-assessment
 Internal audit
 External audit
DQA Process (a Plan-Do-Check-Act cycle)
 Plan: data quality training; data quality plans; construct the audit plan
 Do: self-evaluation; data input; run error logs; report generation
 Check: review self-evaluations; audit input from partners; review error logs; audit data in the database; audit the output reports; submit the audit report (an error-log review sketch follows this list)
 Act: close non-compliances; correct data practices; clean the database
 The auditor is responsible for the areas indicated in yellow in the original diagram
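To illustrate the "run error logs" and "review error logs" steps above, here is a hedged sketch that summarizes an exported error log so recurring problems can be traced back to their source. The log layout (a CSV with site, field, and error_type columns) is an assumption for illustration, not a format prescribed by the original materials.

# Sketch of an error-log review: count errors by site, field, and type
# so that recurring problems can be traced back to their source.
import csv
from collections import Counter
from io import StringIO

# Stand-in for an exported error-log file (assumed layout).
error_log = StringIO(
    "site,field,error_type\n"
    "ClinicA,age,out_of_range\n"
    "ClinicA,visit_date,bad_format\n"
    "ClinicB,age,out_of_range\n"
    "ClinicA,age,out_of_range\n"
)

counts = Counter(
    (row["site"], row["field"], row["error_type"]) for row in csv.DictReader(error_log)
)
for (site, field, error_type), n in counts.most_common():
    print(f"{site}: {field} / {error_type}: {n} occurrence(s)")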
M&E Work Plan tasks
 Identify the risks associated with your current data management practices and assign a risk value to them
 Identify the contingency plans needed to improve data quality practices
 Complete a Data Quality Plan for one of the indicators you will be reporting against
Acknowledgements
 This presentation was the result of on-going
collaborations between:
 USG – The President’s Emergency Plan for
AIDS Relief in South Africa
 USAID
 MEASURE Evaluation
 Khulisa Management Services
MEASURE Evaluation is funded by the U.S. Agency for
International Development (USAID) through Cooperative
Agreement GPO-A-00-03-00003-00 and is implemented by
the Carolina Population Center at the University of North
Carolina in partnership with Futures Group, John Snow,
Inc., Macro International, and Tulane University. Visit us
online at http://www.cpc.unc.edu/measure
