Manual - Health Data Quality
Ministry of Health
November 2018
Foreword
The Federal Ministry of Health is currently implementing the Health Sector Transformation Plan
(HSTP), a five year strategic plan from 2015/16-2020. Information Revolution is one of the four
transformation agendas of HSTP with the objective of maximizing the availability, accessibility,
quality, and use of health information for decision making processes through the appropriate
use of ICTs to positively impact the access, quality, and equity of healthcare delivery at all
levels.
Improving data quality and promoting the culture of information use is at the center of the
information revolution agenda. As a result, the Policy, Planning and Monitoring & Evaluation
Directorate (PPMED) of the FMoH has developed this data quality training manual, which can be
helpful for health workers and managers at all levels of the health system. It will be a useful
guide to improve health data quality and measure data quality at health centers, hospitals,
Woreda Health Offices, Zonal Health Departments, Regional Health Bureaus and other health
institutions.
I would like to thank Monitoring and Evaluation Case team experts at PPMED, experts from
RHBs, Universities and partner organizations for their great contribution in the finalization of
this manual.
Biruk Abate
Acknowledgements
This manual was developed with inputs from a number of experts from FMoH, RHBs, Universities
and partner organizations. The Policy, Planning and Monitoring & Evaluation Directorate of the
Federal Ministry of Health is grateful to all who have been involved in the preparation of this
manual.
Acronyms
ANC Antenatal Care
DHIS District Health Information Software
DHS Demographic and Health Survey
DQA Data Quality Audit
DV Data Verification
EMR Electronic Medical Record
FMoH Federal Ministry of Health
HC Health Centre
HCWs Health Care Workers
HF Health Facility
HIS Health Information System
HIV Human Immunodeficiency Virus
HMIS Health Management Information System
HP Health Post
IUCD Intra Uterine Contraceptive Device
LQAS Lot Quality Assurance Sampling
M&E Monitoring & Evaluation
NCoD National Classification of Diseases
OPD Out Patient Department
PMT Performance Monitoring Team
PRISM Performance of Routine Information System Management
RDQA Routine Data Quality Assessment
RHB Regional Health Bureau
RHIS Routine Health Information System
SBA Skilled Birth Attendant
SNNPR Southern Nations, Nationalities, and Peoples' Region
TB Tuberculosis
VF Verification Factor
WorHo Woreda Health Office
WHO World Health Organization
ZHD Zonal Health Department
Contents
Foreword
Acknowledgements
Acronyms
List of Tables
List of Figures
Section 1: Introduction
1.1. Module Description
1.2. Module goals
1.3. Module learning objectives
1.4. Description of training methods
1.5. Target Group
1.6. Core Competencies
1.7. Module duration and class size
Section 2: Introduction to Data Quality
Section Objectives
2.1. Data and Data Quality Definitions
2.2. Importance of data quality
2.3. Leadership in data quality
2.4. Symptoms of data quality problem
2.5. Challenges in overcoming problems related to data quality
2.6. Possible solutions to problems of data quality
Section 3: Health Data Quality Dimensions
Objectives
3.1. Introduction to data quality dimensions
3.2. Definitions and examples of the data quality dimensions
Section 4: Data Quality Assurance
4.1. Quality Assurance and Data Quality Assurance
4.2. Techniques of data quality assurance
4.2.1. Data Quality Desk review
4.2.2. Lot Quality Assurance Sampling (LQAS)
4.2.4. Routine Data Quality Assessment (RDQA)
Section 5: Using DHIS2 to improve data quality
5.1. Section Introduction
5.2. Data input validation
5.3. Min and max ranges
5.4. Validation rules
5.5. Outlier analysis
5.6. Completeness and timeliness reports
References
Annexes
List of Tables
Table 1: Internal consistency outliers
Table 2: Example of outliers in a given year
Table 3: Internal consistency: Trends over time
Table 4: Example of trends over time
Table 5: Internal Consistency: Comparing Related Indicators
Table 6: Example: Internal Consistency
Table 7: External Consistency: Compare with Survey Results
Table 8: Example: External Consistency
Table 9: External Comparison of Population Data
Table 10: External Comparisons of Population Denominators
List of Figures
Figure 1: Data, Information and Knowledge
Figure 2: Roles and Responsibilities of each level of the health system for maintaining data quality
Figure 3: PRISM Framework
Section 1: Introduction
1.1. Module Description
High-quality data are at the core of program activities. The availability of quality data is at the
heart of functioning evidence-based decision making in the health sector. It is widely recognized
that quality data lead to better clinical and health administration decisions that result in better
health outcomes for the country.
The Federal Ministry of Health (FMoH) has been working towards continuously improving data
and information quality within the health sector. The Ministry reformed the health management
information system in 2008 with the objective of ensuring improved measurement and
standardization towards improvement in quality of data – enabling better decisions and thus
better health outcomes. The reform registered significant improvements in availability and
completeness of source documents and report accuracy. However, data quality is not at the
required level and a lot has to be done if the data is to be relied upon to inform decisions on
health policy, health programs, and allocation of resources.
The overall goal of this training module is to improve data quality at all levels in the health
system, by upgrading the knowledge, skills, and attitudes of health care workers, health information
managers, and administrators at all levels on techniques of improving quality of health care data
in all its dimensions. The module is designed to address all areas in health care where data are
collected and information generated.
1.3. Module learning objectives
• Differentiate the commonly used tools and methods for assessing data quality
• Define and describe the value of monitoring and using data-quality assessment results
over time
1.4. Description of training methods
• Interactive Lectures
• Group discussion and presentation
• Activity-based site visit
• Case studies
Section 2: Introduction to Data Quality
Duration: 2 hours
Section Objectives
At the end of this Section, participants will be able to:
o Describe the concepts of data quality and its importance
o Identify symptoms of data quality problems
o Discuss the role of leadership in data quality management
o Explain the roles and responsibilities of each level of health system for maintaining data
quality
o Discuss the potential challenges and possible solutions of data quality
Teaching Methods
o Brainstorming
o Interactive Lecture
Materials Needed
o Flipchart
o Tape
o Markers
o PowerPoint presentation
o Projector
Section Activities:
In general terms, quality data represent what was intended or defined by their official source, are
objective, unbiased and comply with known standards.
For patients/clients
• Service users are more likely to receive better and safer care if healthcare professionals
have access to accurate and reliable data to support decision making. Accurate and
reliable patient data, such as results of investigations, information on allergies, past
medical history, potential drug interactions, when readily accessible to the healthcare
professionals supports provision of quality healthcare services.
• Service users are more likely to receive better care if healthcare performance data used to
support quality improvement is of good quality and reflects actual performance.
For Researchers
• Researchers can contribute to improved outcomes only if they can rely on quality data to
provide evidence that supports particular care processes and beyond.
Many health care administrators already recognize that quality improvement is the way to add
value to the services offered and that the dissemination of quality data is the only way to
demonstrate that value to health care authorities and the community. To ensure better-quality
health data, all health workers and managers at each level should fulfil their roles and
responsibilities.
• Discuss the role of health workers and managers to ensure data quality and categorize by
level: HF, intermediate administrative level (ZHD/WorHo), and central level
(RHB/FMoH).
o Actively participate in the brainstorming session in small groups
o Raise your discussion points to the whole class
Figure 1 Roles and Responsibilities of each level of the health system for maintaining data quality (From
Measure Evaluation)
Activity: Discuss the symptoms of poor data quality
o Actively participate in the brainstorming session in small groups
o Write your responses on flipchart
o Compare your responses with the identified symptoms of data quality problems
Activity: In your small group, identify the five most common problems that you think affect the
quality of data, and propose actions that could lead to improvements in data quality
Data quality can be affected by different problems across system levels. Some of them are the
following:
Technical determinants
• Lack of guidelines to fill out the data sources and reporting forms
• Data collection and reporting forms are not standardized
• Complex design of data collection and reporting tools
Behavioral determinants
• Personnel not trained in the use of data sources & reporting forms
• Misunderstanding of how to compile data, use tally sheets, and prepare reports
• Math errors occur during data consolidation from data sources, affecting report
preparation
Organizational determinants
Section 3: Health Data Quality Dimensions
3.1 Introduction to data quality dimensions
Regardless of whether in a hospital, health center, a clinic, or a health post, the quality of health
care data and statistical reports has come under intensive scrutiny in recent years. Thus, all health
care service providers, including clerical staff, health professionals, administrators, and health
information managers, need to gain a thorough knowledge and understanding of the key
components of data quality and the requirements for continuous data improvement.
1. Accuracy and Validity
Accurate data are considered correct: the data measure what they are intended to measure.
Accurate data minimize error (e.g., recording or interviewer bias, transcription error, sampling
error) to a point of being negligible.
The original data must be accurate in order to be useful. If data are not accurate, then wrong
impressions and information are being conveyed to the user. Documentation should reflect the
event as it actually happened. Recording data is subject to human error and steps must be taken
to ensure that errors do not occur or, if they do occur, are picked up immediately.
Question!
Give examples of accuracy and validity in both manual and electronic record systems.
Example of accuracy and validity in a manual medical record system
o The patient’s identification details are correct and uniquely identify the patient.
o All relevant facts pertaining to the episode of care are accurately recorded.
o All patient/client records (Cards, forms) in the integrated individual folder are for the
same patient.
o The patient’s address on the record is what the patient says it is.
o Documentation of clinical services in a hospital or health center is of an acceptable
predetermined value.
o The vital signs are as originally recorded, and each entry falls within the predetermined
acceptable value range.
o The abstracted data for indices, statistics and registries meet national and international
standards and have been verified for accuracy.
In a manual system, processes need to be in place to monitor data entry and collection to ensure
quality. In a computerized system, the software can be programmed to check specific fields for
validity and alert the user to a potential data collection error. Computer systems have in-built
checks such as edit and validation checks, which are developed to ensure that the data added to
the record are valid. Edits or rules should be developed for data format and reasonableness,
entailing conditions that must be satisfied for the data to be added to the database, along with a
message that will be displayed if the data entry does not satisfy the condition. In some instances,
the computer does not allow an entry to be added if it fails the edit. In other instances, a warning
is provided for the data entry operator to verify the accuracy of the information before entry.
• A laboratory value must fall within a certain range of numbers or a validity check must
be carried out.
• Format requirements such as the use of hyphens, dashes or leading zeros must be
followed.
• Consistency edits can be developed to compare fields – for example a male patient cannot
receive a pregnancy test.
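The edit and validation checks described above can be sketched in a few lines of Python. This is only an illustration: the field names (`mrn`, `hemoglobin`, `sex`, `pregnancy_test_ordered`), the six-digit identifier format, and the hemoglobin range are assumptions made for the example and are not taken from any particular EMR or from DHIS2.

```python
import re

# Hypothetical limits and formats, chosen only for illustration.
HEMOGLOBIN_RANGE = (3.0, 25.0)        # g/dL, assumed plausible range
MRN_PATTERN = re.compile(r"\d{6}")    # assumed format: six digits, leading zeros kept


def validate_record(record):
    """Return a list of (severity, message) tuples for one data-entry record."""
    problems = []

    # Format edit: the identifier must match the required pattern exactly.
    if not MRN_PATTERN.fullmatch(record.get("mrn", "")):
        problems.append(("error", "MRN must be exactly six digits"))

    # Range edit: issue a warning so the operator can verify before saving.
    hb = record.get("hemoglobin")
    if hb is not None and not (HEMOGLOBIN_RANGE[0] <= hb <= HEMOGLOBIN_RANGE[1]):
        problems.append(("warning", f"Hemoglobin {hb} is outside {HEMOGLOBIN_RANGE}"))

    # Consistency edit: compare two fields against each other.
    if record.get("sex") == "M" and record.get("pregnancy_test_ordered"):
        problems.append(("error", "Pregnancy test recorded for a male patient"))

    return problems


def can_save(problems):
    """Block the entry only when at least one hard error is present."""
    return all(severity != "error" for severity, _ in problems)


good = {"mrn": "004217", "sex": "F", "hemoglobin": 12.5, "pregnancy_test_ordered": True}
bad = {"mrn": "4217", "sex": "M", "hemoglobin": 40.0, "pregnancy_test_ordered": True}

print(validate_record(good), can_save(validate_record(good)))
print(validate_record(bad), can_save(validate_record(bad)))
```

Separating "error" from "warning" mirrors the two behaviours described in the text: some failed edits block the entry, while others only ask the data entry operator to verify the value.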
2. Reliability (Consistency)
Data should yield the same results on repeated collection, processing, storing and display of
information. In other words, data should be consistent.
Four metrics of internal consistency are included in the Data Quality Review (DQR). These are:
1. Presence of outliers:
2. Consistency over time:
3. Consistency between indicators:
4. Consistency of reported data and original records:
Dimension 2.1.1: Presence of outliers: This examines if a data value in a series of values is
extreme in relation to the other values in the series.
Table 1: Internal consistency outliers
Definition
Metric Severity
National Level Regional Level
Table 2: Example of outliers in a given year
          Month (1 to 12)
Woreda         1     2     3     4     5     6     7     8     9    10    11    12  Total Outliers  % Outliers
A           2543  2482  2492  2574  3012  2709  3019  2750  3127  2841  2725  2103               1       8.30%
B           1184  1118  1195  1228  1601  1324  1322   711  1160  1178  1084  1112               2      16.70%
C            776   541   515   527   857   782   735   694   687   628   596   543               0          0%
D           3114  2931  2956  4637  6288  4340  3788  3939  3708  4035  3738  3606               1       8.30%
E           1382  1379  1134  1378  1417  1302  1415  1169  1369  1184  1207  1079               0          0%
National       0     0     0     0     2     0     0     1     0     0     0     1               4       6.70%
The above table shows moderate outliers for a given indicator. There are four identified
moderate outliers. They are highlighted in red. Three of the woredas have at least one occurrence
of a monthly value that is a moderate outlier.
Nationally, this indicator is a percentage of values that are moderate outliers for the indicator.
The numerator for the equation is the number of outliers across all administrative units [in this
case, 4]. The denominator is the total number of expected reported values for the indicator for all
the administrative units. That value is calculated by multiplying the total number of units (in the
selected administrative unit level) with the expected number of reported values for one indicator
for one administrative unit. In this case, we have 5 woredas and 12 expected monthly reported
values per woreda for one indicator, so the denominator is 60 [5 × 12]. Thus, about 6.7% are
moderate outliers [4/60 = 0.0666 × 100, or 6.7 %].
Outliers for a certain indicator (%) = (# of outliers across all administrative units ÷
# total number of expected reported values) × 100
Sub-nationally, calculate the number of outliers for each woreda. Count the woredas where
there are two or more moderate outliers among the monthly values for the woreda [1]. Divide
by the total number of administrative units [1/5 = 0.20 × 100 = 20%].
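The national and sub-national calculations can be checked with a short Python sketch using the monthly values from the example table. The manual does not state the cut-off for a moderate outlier in this section; the sketch assumes a value is a moderate outlier when it lies at least 2 sample standard deviations from the woreda's own mean, an assumption that happens to reproduce the outlier counts shown in the table.

```python
from statistics import mean, stdev

# Monthly values for each woreda, copied from the example table.
monthly = {
    "A": [2543, 2482, 2492, 2574, 3012, 2709, 3019, 2750, 3127, 2841, 2725, 2103],
    "B": [1184, 1118, 1195, 1228, 1601, 1324, 1322, 711, 1160, 1178, 1084, 1112],
    "C": [776, 541, 515, 527, 857, 782, 735, 694, 687, 628, 596, 543],
    "D": [3114, 2931, 2956, 4637, 6288, 4340, 3788, 3939, 3708, 4035, 3738, 3606],
    "E": [1382, 1379, 1134, 1378, 1417, 1302, 1415, 1169, 1369, 1184, 1207, 1079],
}


def moderate_outliers(values, threshold_sd=2.0):
    """Values at least `threshold_sd` sample SDs from the mean (assumed rule)."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) >= threshold_sd * s]


# Outliers per woreda.
counts = {woreda: len(moderate_outliers(v)) for woreda, v in monthly.items()}

# National level: outliers as a share of all expected monthly values (5 x 12 = 60).
total_expected = sum(len(v) for v in monthly.values())
national_pct = 100 * sum(counts.values()) / total_expected

# Sub-national level: share of woredas with two or more moderate outliers.
flagged = [woreda for woreda, c in counts.items() if c >= 2]
regional_pct = 100 * len(flagged) / len(monthly)

print(counts)
print(round(national_pct, 1), flagged, regional_pct)
```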
Dimension 2.1.2: Consistency over time: The plausibility of reported results for selected
programme indicators is examined in terms of the history of reporting of the indicators. Trends
are evaluated to determine whether reported values are extreme in relation to other values
reported during the year or over several years.
Table 3: Internal consistency: Trends over time
Metric: Trends/Consistency over Time (analyze each indicator separately)
Definition, National Level: Conduct one of the following, based on the indicator’s expected trend:
• Compare current year to the value predicted from the trend in the 3 preceding years
• Graphic depiction of trend to determine plausibility based on programmatic knowledge
Definition, Regional Level: # (%) of woredas whose ratio of current year to predicted value (or
current year to average of preceding 3 years) is at least ± 33% of national ratio.
Table 4: Example of trends over time
Woreda    2010    2011    2012    2013    Mean of Preceding 3 Years (2010-2012)    Ratio of 2013 to Mean of 2010-2012    % Difference between National and Woreda Ratios
A        30242   29543   26848   32377                                    28878                                  1.12                                               0.03
Mean of preceding three years (2010, 2011, and 2012) is 93,774 [(98,450 + 93,578 + 89,294)/3]
Ratio of current year to the mean of the past three years is 1.16 [108,459/93,774 ≈ 1.16].
The average ratio of 1.16 shows that there is an overall 16% increase in the service outputs for
2013 when compared to the average service outputs for the preceding three years of the
indicator.
Regionally, try to evaluate each woreda by calculating the ratio of the current year (2013) to the
average of the previous three years (2010, 2011, and 2012). For example, the ratio for Woreda A
is 1.12 [32,377/28,878].
Then calculate the % difference between the national and woreda ratios for each woreda. For
example, for Woreda A: |1.12 − 1.16| / 1.16 ≈ 0.03, or about 3%.
The difference between the woreda ratio and the national ratio for Woreda A is less than 33%.
However, there is a difference of approximately 44% for Woreda D between woreda ratio and
the national ratio.
To calculate this indicator sub-nationally, all administrative units whose ratios differ from
the country’s ratio by ±33% or more are counted. In this example, only Woreda D has a
difference greater than ±33%. Therefore, 1 out of 5 woredas (20%) has a ratio that is more than
33% different from the national ratio.
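The consistency-over-time calculation can be sketched as follows, using the national totals and the Woreda A values quoted in the text. The relative-difference formula (absolute difference divided by the national ratio) is an assumption; it is used here because it reproduces the 0.03 shown for Woreda A in Table 4. The function names are illustrative only.

```python
def ratio_to_mean(current, preceding):
    """Ratio of the current-year value to the mean of the preceding years."""
    return current / (sum(preceding) / len(preceding))


def relative_difference(unit_ratio, national_ratio):
    """Assumed formula: absolute difference relative to the national ratio."""
    return abs(unit_ratio - national_ratio) / national_ratio


# National totals for 2010-2012 and 2013, as quoted in the text.
national_ratio = ratio_to_mean(108_459, [98_450, 93_578, 89_294])

# Woreda A, from Table 4.
woreda_a_ratio = ratio_to_mean(32_377, [30_242, 29_543, 26_848])
diff_a = relative_difference(woreda_a_ratio, national_ratio)

# A woreda is counted when its ratio differs from the national ratio by 33% or more.
flagged_a = diff_a >= 0.33

print(round(national_ratio, 2), round(woreda_a_ratio, 2), round(diff_a, 2), flagged_a)
```

Applying the same two functions to every woreda and counting the flagged ones gives the sub-national result.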
Table 5: Internal Consistency: Comparing Related Indicators
Metric: Consistency among related indicators
Maternal Health
National Level: ANC1 – Syphilis test (should not be negative)
Regional Level: # (%) of regional units where there is an extreme difference (≥ ± 10%)
Immunization
National Level: Penta3 dropout rate = (Penta1 – Penta3)/Penta1 (should not be negative)
Regional Level: # (%) of regional units with # of Penta3 immunizations > Penta1 immunizations
(negative dropout)
HIV/AIDS
National Level: (HIV positive pregnant women – HIV positive pregnant women who received ART)
(should not be negative)
Regional Level: # (%) of regional units where there is an extreme difference (≥ ± 10%)
TB
National Level: (TB treatment success rate – TB cure rate) (should not be negative)
Regional Level: # (%) of regional units where there is an extreme difference (≥ ± 10%)
Malaria
National Level: # confirmed malaria cases reported – cases testing positive (should be roughly equal)
Regional Level: # (%) of regional units where there is an extreme difference (≥ ± 10%)
Table 6: Example: Internal Consistency
Region    ANC1    Syphilis test    Ratio of ANC1 to Syphilis test    % Difference between National & Regional Ratios
A        20995            18080                              1.16                                               0.02
The number of pregnant women started on antenatal care each year (ANC1) should be
roughly equal to the number of pregnant women who receive a syphilis test in ANC, because all
pregnant women should receive this test. First, we calculate the ratio of ANC1 to syphilis test
for the national level, and then for each woreda. At the national level, the ratio of ANC1 to
syphilis test is about 1.18 [78,477/66,548].
There is one woreda (D) with a ratio of ANC1 to syphilis test greater than 20%. We also see that
the % difference between the national and woreda ratios for woreda D is more than 10%.
At the regional level, we can calculate the ratio of ANC1 to syphilis test and the % difference
between the national and woreda ratios.
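A short Python sketch of the related-indicator checks follows. The ANC1 and syphilis test figures are the national and Region A values quoted above; the Penta1 and Penta3 numbers are hypothetical and are included only to show how a negative dropout rate is detected. The relative-difference formula is again an assumption, chosen because it reproduces the 0.02 shown for Region A in Table 6.

```python
def indicator_ratio(numerator, denominator):
    """Ratio between two related indicators, e.g. ANC1 to syphilis tests."""
    return numerator / denominator


def relative_difference(unit_ratio, national_ratio):
    """Assumed formula: absolute difference relative to the national ratio."""
    return abs(unit_ratio - national_ratio) / national_ratio


def penta_dropout_rate(penta1, penta3):
    """Penta3 dropout rate = (Penta1 - Penta3) / Penta1; should not be negative."""
    return (penta1 - penta3) / penta1


# ANC1 versus syphilis test, values from the text and Table 6.
national_ratio = indicator_ratio(78_477, 66_548)
region_a_ratio = indicator_ratio(20_995, 18_080)
diff_a = relative_difference(region_a_ratio, national_ratio)
extreme_a = diff_a >= 0.10  # extreme difference threshold from Table 5

# Hypothetical unit reporting more Penta3 than Penta1 doses.
dropout = penta_dropout_rate(penta1=1_000, penta3=1_040)
negative_dropout = dropout < 0

print(round(national_ratio, 2), round(region_a_ratio, 2), round(diff_a, 2), extreme_a)
print(round(dropout, 2), negative_dropout)
```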
Dimension 2.1.4: Consistency of reported data and original records: This involves an
assessment of the reporting accuracy for selected indicators through the review of source
documents in health facilities. This element of internal consistency is measured by a data
verification exercise which requires a record review to be conducted in a sample of health
facilities. It is the only dimension of data quality that requires additional collection of primary
data.
HMIS can also be compared to pharmacy records or other types of data to ensure that the two
sources fall within a similar range.
Table 7: External Consistency: Compare with Survey Results
Indicator: ANC 1st visit
National Level: Ratio of facility ANC1 coverage rates to survey ANC1 coverage rates
Regional Level: # (%) of aggregation units used for the most recent population-based survey,
such as zone/state/region, whose ANC1 facility-based coverage rates and survey coverage rates
differ by at least 33%
Indicator: Penta3 vaccine
National Level: Ratio of Penta3 coverage rates from routine data to survey Penta3 coverage rates
Regional Level: # (%) of aggregation units used for the most recent population-based survey,
such as zone/state/region, whose Penta3 facility-based coverage rates and survey coverage rates
differ by at least 33%
• Population-based surveys: Demographic and Health Survey (DHS), EPI Cluster survey.
• Indicator values are based on recall, referring to period before the survey (such as 5
years)
• Sampling error: confidence intervals
Table 8: Example: External Consistency
NB: Comparison of HMIS and survey coverage rates for ANC1 Differences ≥ 33% are highlighted in red.
If the HMIS is accurately detecting all ANC visits in the country (not just those limited to the
public sector), and the denominators are accurate, the coverage rate for ANC1 derived from the
HMIS should be very similar to the ANC1 coverage rate derived from population surveys.
However, HMIS coverage rates are often different from survey coverage rates for the same
indicator.
At the regional level, the ratio of denominators is calculated for each administrative unit.
Woredas with at least 33% difference between their two denominators are flagged. Woredas C
and D have more than 33% difference between their two ratios.
• The adequacy of the population data used in the calculation of health indicators
• The comparison of two different sources of population estimates (for which the values are
calculated differently) to see the level of congruence between the two sources
Table 9: External Comparison of Population Data
Definition
Metric
National Level Regional Level
NB: Comparison of national and regional administrative unit ratios of official government live birth
estimates. Administrative units with differences ≥ ±10% are highlighted in red.
The above table shows the ratio of the number of live births from official government statistics
nationally for the year of analysis to the value used by the selected health program.
Calculate the ratio of regional administrative unit 2014 live births to the value used by the
selected health program; woreda B has a difference of 0.17 or 17%.
3. Completeness
All required data should be present and the medical/health record should contain all pertinent
documents with complete and appropriate documentation.
Data Completeness on data recording tools (Registers, cards/forms)
This means that all necessary data elements on registers, forms, and cards should be filled in
by the care provider immediately after the service is provided.
• The cover page of integrated individual folder should contain all the necessary
identifying data to uniquely identify an individual patient or client.
• For inpatients or clients who received a service, the registers should contain, accurately
recorded, all the necessary information pertinent to the service provided.
• For all medical/health records, relevant forms are complete, with signatures and date of
attendance.
Completeness of reports (%) = (# reports that are complete (all data elements filled out) ÷
# total reports available or received) × 100
The administrative unit will check whether any data elements are left blank and take action. The
administrative health unit can also calculate the proportion of data elements with a zero value in
the monthly/quarterly service report out of the total data elements expected.
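The two completeness checks above can be sketched in Python as follows. The data element names and report values are hypothetical and serve only as an illustration; a blank data element is represented by `None`, while a reported zero counts as filled in.

```python
# Hypothetical data elements expected in each service report.
REQUIRED = ["anc1", "penta3", "opd_visits", "sba_deliveries"]

# Hypothetical reports received from three facilities (None = left blank).
reports = [
    {"anc1": 120, "penta3": 98, "opd_visits": 1450, "sba_deliveries": 35},
    {"anc1": 87, "penta3": None, "opd_visits": 990, "sba_deliveries": 0},
    {"anc1": 64, "penta3": 70, "opd_visits": 0, "sba_deliveries": 0},
]


def is_complete(report, required=REQUIRED):
    """A report is complete when every required data element is filled in."""
    return all(report.get(field) is not None for field in required)


def content_completeness_pct(received, required=REQUIRED):
    """Completeness of reports (%) = complete reports / reports received x 100."""
    complete = sum(1 for r in received if is_complete(r, required))
    return 100 * complete / len(received)


def zero_value_pct(report, required=REQUIRED):
    """Share of expected data elements reported with a zero value."""
    zeros = sum(1 for field in required if report.get(field) == 0)
    return 100 * zeros / len(required)


print(round(content_completeness_pct(reports), 1))
print([zero_value_pct(r) for r in reports])
```

Running the same calculation separately for each report type shows which report type has the gap.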
Explain!
N.B.: It is important to calculate the content completeness for each report type separately to
identify which report type has the gap and act on it accordingly. Explain what this will tell us.
Report Completeness
This examines the total reports received from all health facilities against the total reports
expected for a given period of time. All health posts and health facilities are expected to send
monthly reports (service and disease reports), a quarterly service report every quarter, and an
annual service report once a year. For HC and HSP with inpatient services, an IPD morbidity and
mortality report is expected on a monthly basis.
4. Timeliness
5. Legibility
Examples of legibility
• Handwritten demographic data are clearly written and readable.
• Handwritten notes on patient forms, admission cards, and any other medical records or
registers are clear, concise, readable and understandable.
• Handwritten National Classification of Diseases (NCoD) diagnoses are clear and easy to
understand when transcribing into the register.
In all medical/health records, cryptic codes or symbols must not be used in either manual or
electronic patient records.
If abbreviations are used, they must be standard and understood by all health care professionals
involved in the service being provided to the patient. This problem is mostly seen in outpatient
and inpatient departments, which are major sources of clinical data and NCoD diagnoses.
6. Accessibility
All necessary data are available when needed for patient care and for all other official purposes.
The value of accurately recorded data is lost if it is not accessible.
Examples of accessibility
• Medical/health records are available when and where needed at all times.
• Abstracted data are available for review when and where needed.
• In an electronic patient record system, clinical information is readily available when
needed.
• Statistical reports are accessible when required for the Performance Monitoring Team,
planning meetings, government requirements, or any other official need.
7. Precision
This means that the data have sufficient detail. For example, an indicator requires the number of
individuals who received HIV counseling and testing and received their test results by sex of the
individual. An information system lacks precision if it is not designed to record the sex of the
individual who received counseling and testing.
8. Confidentiality
Confidentiality means that clients are assured that their data will be maintained according to
national and/or international standards for data. This means that personal data are not disclosed
inappropriately, and that data in hard copy and electronic form are treated with appropriate levels
of security (kept in locked cabinets and in password-protected files).
9. Integrity
Integrity is the quality of being honest and having strong moral principles or moral uprightness.
Data integrity can be considered the polar opposite of data corruption, which renders
information ineffective in fulfilling the desired data requirements.
10. Relevance
Section 4: Data Quality Assurance
Section duration: 2 days
Section Objectives
Activities
Question!
Recap data quality dimensions discussed in Section 2.
4.1. Quality Assurance and Data Quality Assurance
Quality Assurance: “A program for the systematic monitoring and evaluation of the
various aspects of a project, service, or facility (and taking actions accordingly) to ensure
that standards of quality are being met” (Merriam-Webster Dictionary)
Data quality assessments help to improve data quality by uncovering hidden problems in the
collection, aggregation, and transmission of priority indicators/data. Knowing about these
problems allows health professionals and managers to develop a data quality improvement plan.
Different techniques are used at facility and administrative levels to show the level of data
quality and to take corrective measures.
The following methodology shall be applied to assure data quality at service delivery and
intermediate health administration units:
• A desk review of the data that have been reported to national level whereby the quality of
aggregate reported data for recommended program indicators is examined using
standardized data quality metrics;
• Health facility assessment
o Data Quality Checks using LQAS method
o Other health facility assessments to conduct data verification and an evaluation of
the adequacy of the information system to produce quality data (system
assessment).
• Administrative health unit level data quality assessment
o Routine Data Quality Assessment (RDQA)
o Data Quality Audit (DQA)
o Performance of Routine Information System Management (PRISM)
Major Differences among LQAS, DQA, RDQA and PRISM
Ethiopian Data Quality Assurance Timeline
[Figure: timeline showing three recurring activities]
• Routine DQA: quarterly DV and annual RDQA by RHBs, and annual RDQA by FMOH
• Supportive supervision: bi-monthly supportive supervision visits by WorHO
• LQAS: monthly self-assessment by health facilities
4.2.1 Data Quality Desk review
Description
The desk review examines a core set of tracer indicators selected across program areas in relation
to these dimensions. The desk review requires monthly or quarterly data by sub national
administrative area for the most recent reporting year and annual aggregated data for the selected
indicators for the last three reporting years.
This cross-cutting analysis of the recommended program indicators across quality dimensions
quantifies problems of data quality according to individual program areas but also provides
valuable information on the overall adequacy of health-facility data to support planning and
annual monitoring.
The desk review compares the performance of the country information system with
recommended benchmarks for quality, and flags for further review any sub national
administrative units which fail to attain the benchmark. User-defined benchmarks can be
established at the discretion of assessment planners.
Who
FMOH and RHBs/ZHDs. The desk review is expected to be done by the M&E units, and
feedback on the findings should be communicated back for further action.
Frequency
WHO recommends that the data quality desk review be conducted annually. As many of the
consistency metrics require annual data, the FMOH also recommends conducting this review
annually.
Data quality dimensions addressed
Dimension 2: Consistency
Dimension 3.1: Internal consistency of reported data (except consistency of
reported data and original records)
Dimension 3.2: External consistency
Dimension 3.3: External comparisons of population data
Dimension 3: Completeness (except data completeness on data recording tools: registers,
cards/forms)
Dimension 4: Timeliness
Data requirement
The desk review requires monthly or quarterly data by subnational administrative area for the
most recent reporting year and annual aggregated data for the selected indicators for the last
three reporting years.
Information on submitted aggregate reports and when they were received will be required in
order to evaluate completeness and timeliness of reporting.
Other data requirements include denominator data for calculating coverage rates for the selected
indicators and survey results (and their standard errors) from the most recent population-based
survey – such as the Demographic and Health Surveys (DHS) and immunization coverage
surveys.
How
Doing a data quality desk review manually is cumbersome and challenging, and it requires
advanced data analysis skills. The Ethiopian MOH has customized DHIS 2 to include
dashboards for analyzing and displaying the above-stated data quality metrics. Detailed
discussion and hands-on training will be provided under Section 5.
Lot Quality Assurance Sampling (LQAS) - is a technique useful for assessing whether
the desired level of reporting accuracy has been achieved by comparing data in
relevant record forms (i.e. registers or tallies) and the HMIS reports.
Description:
LQAS compares data in the relevant record forms (i.e. registers or tallies) with the HMIS reports.
The standard being checked is that the data compiled in databases and reporting forms are
accurate, with no inconsistency between what is in the registers and what is in the
databases/reporting forms at facility level. Similarly, when data are entered into computers,
there should be no inconsistency between the reporting forms and the computer file.
The LQAS method will be used to check reporting accuracy at health facility level. Health
facilities will maintain a registry to record the data consistency check results and to track the
trend in data quality improvement.
LQAS tests the hypothesis that the desired level of HMIS data quality has been achieved. It uses
a sample of 12 data elements to check reporting accuracy.
If the number of sampled data elements not meeting the standard exceeds a pre-determined
criterion (decision rule), then the lot is rejected, that is, considered not to have achieved the
pre-set standard. A decision rule table is used to determine whether the pre-set criterion is met.
Comparing LQAS results over time can indicate the level of change.
Who
Health facilities (Hospital, health center and health posts).
Frequency
Monthly
How
Steps to carry out LQAS
Step 1 Decide the month for which you want to do the data accuracy check.
Step 2 Pre-fix the level of data accuracy that you are expecting, e.g. 85% or 90% etc.
Step 3 Put serial numbers against the data elements (not disaggregation) in the Service
Delivery or Disease Report that you want to include in the data accuracy check
Step 4 Generate twelve random numbers using Excel program. These random numbers
represent the serial numbers of the data elements included in the data accuracy
check. Note them in Column of the Data Accuracy Check Sheet. This is to ensure
representation of all data elements by giving equal chance to all data elements.
Step 5 List down the selected data elements from the report on to the Data Accuracy Check
Sheet in Column 2 and Column 3
Step 6 Write down the reported figures from the Monthly HMIS Report for the selected
data elements in the Column 4 of the Data Accuracy Check Sheet.
Note: In case of Health Post, figures for the selected data elements from the Tally
Sheet will be compared with recounted figures from the Family Folders. Therefore,
record the figures for the selected data elements from the Tally Sheet in Column 5
Step 7 Recount the figure from the corresponding registers and note the figures on Column
5 of the LQAS check-sheet
Step 8 If the figures for a particular data element match or do not match put “yes” or “no”
accordingly in Column 6 or Column 7 respectively.
Step 9 Count the total number of “yes” and “no” at the end of the table
Step 10 Match the total number of “yes” with the LQAS Decision Rule table and determine
whether the level of data accuracy meets the expected target.
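The steps above can be sketched in a few lines of Python. This is only an illustration: the manual uses Excel to generate the random numbers, the sketch assumes the twelve serial numbers are drawn without repetition, and the figures used (60 data elements, a value of 10 for each) are hypothetical.

```python
import random


def select_data_elements(total_elements, sample_size=12, seed=None):
    """Pick `sample_size` distinct serial numbers from 1..total_elements (Step 4)."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, total_elements + 1), sample_size))


def accuracy_check(reported, recounted):
    """Compare reported and recounted figures (Steps 6-9).

    Both arguments map a data element serial number to a figure.
    Returns the number of "yes" (match) and "no" (mismatch) results.
    """
    yes = sum(1 for serial in reported if reported[serial] == recounted[serial])
    return yes, len(reported) - yes


# Hypothetical report with 60 numbered data elements.
serials = select_data_elements(total_elements=60, seed=1)
reported = {serial: 10 for serial in serials}   # figures from the monthly report
recounted = dict(reported)                      # figures recounted from registers
recounted[serials[0]] += 2                      # one element does not match
yes_count, no_count = accuracy_check(reported, recounted)
print(f"yes: {yes_count}, no: {no_count}")
```

The "yes" total is then compared with the LQAS Decision Rule table (Step 10).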
Questions
o In your view, what should be the desired HMIS data accuracy level?
o In order for the HMIS report to meet the desired accuracy level, how many data elements would
completely match? (Ask them to find the desired number of matches in the “Decision Rule”
table)
o How many data elements on the handout show that they match?
o What is the data accuracy level achieved?
o Does that level meet the desired data accuracy level?
o Invite questions from the participants and clarify accordingly.
Decision Rules for Sample Size of 12 and Coverage Targets/Averages of 20-95%
Average Coverage (baselines) / Annual Coverage Targets (monitoring and evaluation)
Sample size  <20%  20%  25%  30%  35%  40%  45%  55%  60%  65%  70%  75%  80%  85%  90%  95%
12           N/A   1    1    2    2    3    4    5    6    7    7    8    8    9    10   11
Decision Rules
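A minimal sketch of how the decision rule table can be applied follows. It assumes the decision rule is the minimum number of “yes” results needed out of the 12 sampled data elements, and it copies the values exactly as printed in the table above (the printed header has no 50% column, so none is included here). The function name is hypothetical.

```python
# Decision rules for a sample size of 12, copied as printed in the table.
# Key: accuracy target in percent. Value: assumed minimum number of "yes" results.
DECISION_RULE_N12 = {
    20: 1, 25: 1, 30: 2, 35: 2, 40: 3, 45: 4, 55: 5, 60: 6,
    65: 7, 70: 7, 75: 8, 80: 8, 85: 9, 90: 10, 95: 11,
}


def meets_target(yes_count, target_percent):
    """Return True if the number of matches reaches the decision rule."""
    if target_percent not in DECISION_RULE_N12:
        raise ValueError(f"no decision rule printed for {target_percent}%")
    return yes_count >= DECISION_RULE_N12[target_percent]


# Example: 10 of the 12 sampled data elements matched.
print(meets_target(10, 85))
print(meets_target(10, 95))
```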
After the report has been revised, the HMIS focal person should repeat the LQAS check following
the same procedure. However, the first LQAS score should be reported in the monthly report format,
and the health facility should keep both LQAS accuracy check sheets in the PMT minute book
(data quality log book). Health facilities should monitor the trend of LQAS results across months
to see the changes over time.
Please note that Health Facilities will maintain a registry to record the data accuracy check
results. The HMIS focal persons will also use it for recording the data accuracy check during
their supportive supervision visits.
Question?
What actions would be necessary if they find that the data accuracy at a health facility is not of
the desired level?
Step 1: Conduct a data quality check at facility level. We are checking how many mistakes are
made during the transfer of data from registers to monthly reporting forms. Thus, you need the
various registers and a monthly reporting form.
o Randomly select any 12 data points (with numbers) from the monthly report form. Enter
them into the first column of the data quality checklist.
o Copy the number from the monthly report form into the second column of the data
quality checklist, under the heading "monthly report".
o Recount the total for each selected data item from the register and enter that number into
the third column of the data quality checklist, under the heading "register".
o If the numbers are the same in columns 2 and 3, enter “yes” in column 4, otherwise “no.”
o Calculate the total matched and mismatched numbers and write them in the total row. The
total matched is the number of accurate data items.
This is a simple method used at health facilities to check the consistency of reports before or
after data entry. The PMT members sit together and look across each line and then from
top to bottom to identify missing data values, unexpected fluctuations beyond
maximum/minimum values, inconsistencies between linked data elements, and
mathematical errors.
Examples:
• Family planning acceptors by age and method disaggregation
• Antenatal first attendance by gestational and age disaggregation
• Delivery attended by skilled health personnel vs Sum of still birth and live birth
Frequency:
Whenever report is generated
Data quality dimensions addressed:
- Presence of outlier
- Data completeness
- Internal consistency between indicators
RDQA is an assessment technique that can be used for self-assessment, for monitoring progress
and for evaluating the status of the RHIS. Unlike LQAS, the RDQA helps health facilities and
administrative health units to verify reported data against source documents and to examine
RHIS implementation. It is a simpler version of the DQA. Each level of the data management
system has a role to play and specific responsibilities in ensuring data quality throughout the
system. The RDQA tool should be applied regularly to monitor the trend in data quality. It is
recommended that administrative health units implement it quarterly; health facilities can use
a customized version for self-assessment.
Objective of RDQA:
By using the RDQA tool, we can achieve three main objectives.
1. Verify rapidly
o the quality of reported data for key indicators at selected sites;
o the ability of data management systems to collect, manage, and report good-
quality data
2. Implement
o corrective measures with action plans for strengthening the data management and
reporting system
o improving data quality
3. Monitor
o capacity improvements and performance of the data management and reporting
system to produce good-quality data
Importance of RDQA
1. Routine data quality checks as part of on‐going supervision
o Routine data quality checks can be included in already planned supervision visits at
the service delivery sites.
2. Initial and follow‐up assessments of data management and reporting systems
o Repeated assessments (e.g., biannually or annually) of a system’s ability to collect
and report quality data at all levels can be used to identify gaps and monitor necessary
improvements.
3. Strengthening program staff’s capacity in data management and reporting
o M&E staff can be trained on the RDQA and be sensitized to the need to strengthen
the key functional areas linked to data management and reporting in order to produce
quality data
4. Preparation for a formal data quality audit
o The RDQA tool can help identify data quality issues and areas of weakness in the
data management and reporting system that would need to be strengthened to increase
readiness for a formal data quality audit
5. External assessment by partners of the quality of data
o Such use of the RDQA for external assessments could be more frequent, more
streamlined and less resource intensive than comprehensive data quality audits that
use the DQA version for auditing.
Components of RDQA
RDQA tool has two key components, which are data verification and system assessment.
RDQA tool can be implemented at any or all levels of the data management and reporting
system, M&E Unit; intermediate aggregation levels (e.g. region and woreda); and/or service
delivery points.
The RDQA tool has six parts that help to assess and improve RHIS performance: data
verification, system assessment, interpretation of outputs, development of action plans,
dissemination of results, and on-going monitoring.
RDQA focuses on two major assessment methods: 1) Documentation review: answering
yes/no questions on whether the source documents required for the assessment are available,
completed and within the required reporting period. 2) Data verification: checking the
indicator of interest found in the periodic summary report against an alternative data
source. The degree to which the two sources match is an indication of data quality.
Once the purpose of the RDQA has been determined, the second step is to decide which
levels of the data management and reporting system will be included in the assessment: service
delivery sites, intermediate aggregation levels (e.g. regions, woredas), and/or the central M&E
unit. It is not necessary to visit all the reporting sites in a given program to determine the
quality of the data or how the HIS functions. Random sampling techniques can be used to
select a representative group of sites whose data quality is indicative of data quality for the whole
program.
p = the estimated proportion of data quality (if a previous study exists, p is the accuracy
level of the indicator that gives the highest sample size; if no study exists, p is 50%)
$z_{1-\alpha/2}$ = the z score corresponding to the probability with which it is desirable to be able to
conclude that an observed change of size could not have occurred by chance (for $\alpha = 0.05$,
$z_{1-\alpha/2} = 1.96$)
s = the precision or margin of error, set at 0.05
If N (the total number of clusters or woredas) < 10,000, a correction formula will be used:
$n_f = \dfrac{n}{1 + n/N}$
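The calculation can be illustrated with a short Python sketch. The uncorrected sample size formula is not printed in this part of the manual; the sketch assumes the standard single-proportion formula n = z² p (1 − p) / s², which matches the symbols defined above. The value N = 800 is hypothetical.

```python
import math


def sample_size(p=0.5, z=1.96, s=0.05):
    """Uncorrected sample size, assuming n = z^2 * p * (1 - p) / s^2."""
    return (z ** 2) * p * (1 - p) / (s ** 2)


def corrected_sample_size(n, N):
    """Correction from the text: nf = n / (1 + n / N)."""
    return n / (1 + n / N)


n = sample_size()                      # p = 50%, z = 1.96, s = 0.05
nf = corrected_sample_size(n, N=800)   # hypothetical total of 800 units
print(math.ceil(n), math.ceil(nf))
```

Rounding up (rather than to the nearest integer) is a common convention so that the achieved precision is not worse than planned.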
C. Determining the number of sites at Regional level:
The above stated sampling methodologies can be employed to select the appropriate number of
sites and clusters based on the objectives of the assessment. Precise estimates of data quality
require a large number of clusters and sites. Often it isn’t necessary to have a statistically robust
estimate of accuracy. That is, it is sufficient to have a reasonable estimate of the accuracy of
reporting to direct system strengthening measures and build capacity. A reasonable estimate
requires far fewer sites and is more practical in terms of resources. Generally, 12 sites sampled
from within 4 clusters (3 sites each) are sufficient to gain an understanding of the quality of the
data and the corrective measures required.
The Ethiopian MOH recommends the following sample size and methodology for RDQA
(especially for DV):
• Use census of all health centers and hospitals in the Woreda
D. Frequency
The frequency of RDQA should be based on the objective of the assessment and
the level of the organization conducting it. Accordingly, the data verification part should be done
quarterly, integrated with supportive supervision visits, by organizations at all levels, whereas
a comprehensive RDQA (data verification and system assessment)
should be done annually by federal or regional level coordinating bodies. It is also important to
clearly identify the reporting period associated with the indicator(s) to be assessed. Ideally, the
time period should correspond to the most recent relevant reporting period or schedule in HMIS.
Level               Data Verification    Full RDQA
FMOH                Bi-annually          Annually
RHB                 Quarterly            Annually
WoHO                Every two months     NA
Health Facilities   NA                   NA
Step 3: Selection of Indicators and data source
Determination of indicators and reporting period that should be included in the assessment is also
an important step in RDQA. It is recommended that a maximum of four indicators can be
included. More than four indicators could lead to an excessive number of sites to be evaluated.
The criteria for selecting the indicators for the RDQA could be the following:
1. Must review indicators: Indicators that should be selected first depending on the
indicator’s national and global importance/ priority.
2. Relative magnitude of the indicators: The amount of budget and activity associated with
the indicator(s).
3. Case by Case Purposive Selection: Indicators for which data quality questions exist and
the government wants to be routinely verified. Those reasons should be documented as
justification for inclusion.
Selected sites should be notified prior to the visit for the data quality assessment. This
notification is important in order for appropriate staff to be available to answer the questions in
the checklist and to facilitate the data verification by providing access to relevant source
documents.
The team should sit with the facility in-charge and other management members and explain
the objective of the assessment before starting the formal data collection. The data collection
team may spend half a day at one health facility just filling in the checklist. During the site visits,
the relevant sections of the appropriate checklists in the Excel file are filled out (e.g. the service
site checklist at service sites, etc.). These checklists are completed following interviews of
relevant staff and reviews of site documentation. A copy should be given to the facility so that
it can see its gaps and take corrective measures even before the release of the official report.
The RDQA tool has 19 worksheets; the first two give general information on how to use
the tool and the rest help in data collection and data analysis.
Tell participants that this training will focus on Data Verification component and a separate full
RDQA training will be provided for those who will be involved in the comprehensive assessment.
1) Data Verification
The purpose is to assess, on a limited scale, if service delivery and intermediate aggregation sites
are collecting and reporting data to measure the indicator(s) accurately and on time — and to
cross-check the reported results with other data sources. To do this, the RDQA will determine if
a sample of Service Delivery Sites have accurately recorded the activity related to the selected
indicator(s) on source documents.
The data verification exercise will take place in two stages:
1. In-depth verifications at the Service Delivery Sites;
1.1 Verify reported data against data recounted from registers
Example
VF = ΣA / ΣB, where A is the recounted figure and B is the reported figure.
Indicator          Description     HF1  HF2  HF3  HF4  HF5  HF6  HF7  Total  VF (ΣA/ΣB)
ANC4               Recounted = A    10   50   70   20   30   40   20    240  0.90
                   Reported = B     12   65   70   20   25   45   30    267
Penta 3            Recounted = A    25   45   30   12   20   10    0    142  0.83
                   Reported = B     38   59   30   16   15   13    0    171
Currently on ART   Recounted = A    10   22   10    5   40   19   20    126  1.94
                   Reported = B      0   12    4    5   32   12    0     65
Measles            Recounted = A    20   55   34   14   45   25   27    220  0.79
                   Reported = B     12   42   23   22   95   36   47    277
TB all forms       Recounted = A    41   71   29   78    9    1   12    241  1.14
                   Reported = B     29   36   34   80    6   10   17    212
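The verification factors in the example can be recomputed with a short Python sketch. The figures are copied from the example table; the function name is hypothetical, and the interpretation in the comments follows the definition VF = ΣA / ΣB (recounted divided by reported).

```python
# Recounted (A) and reported (B) figures for HF1..HF7, copied from the example.
recounted = {
    "ANC4": [10, 50, 70, 20, 30, 40, 20],
    "Penta 3": [25, 45, 30, 12, 20, 10, 0],
    "Currently on ART": [10, 22, 10, 5, 40, 19, 20],
    "Measles": [20, 55, 34, 14, 45, 25, 27],
    "TB all forms": [41, 71, 29, 78, 9, 1, 12],
}
reported = {
    "ANC4": [12, 65, 70, 20, 25, 45, 30],
    "Penta 3": [38, 59, 30, 16, 15, 13, 0],
    "Currently on ART": [0, 12, 4, 5, 32, 12, 0],
    "Measles": [12, 42, 23, 22, 95, 36, 47],
    "TB all forms": [29, 36, 34, 80, 6, 10, 17],
}


def verification_factor(recounted_values, reported_values):
    """VF = sum of recounted figures / sum of reported figures."""
    total_reported = sum(reported_values)
    if total_reported == 0:
        raise ValueError("VF is undefined when nothing was reported")
    return sum(recounted_values) / total_reported


vf = {name: verification_factor(recounted[name], reported[name]) for name in recounted}
for name, value in vf.items():
    # VF above 1 means more was recounted than reported (under-reporting);
    # VF below 1 means more was reported than recounted (over-reporting).
    direction = "under-reporting" if value > 1 else "over-reporting" if value < 1 else "match"
    print(f"{name}: VF = {value:.2f} ({direction})")
```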
1.2 Verify the primary source of data (Medical records) against the secondary source of
data (registers): The purpose of this verification process is to measure the level of under
reporting by comparing data elements from medical records and registers. It is a method of
randomly selecting 10-20 medical records from the Card room and verifying if all the data
elements that are supposed to be recorded are captured in the register. We can summarize
the data for selected medical records as complete or incomplete based on the number of data
elements recorded for the latest visit that matched between the medical record and the
register.
Instructions
1. Randomly select 10 sample medical record numbers from the central register from the list
of patients who were seen in the last three days
2. Write the medical record number of each card in the first column
3. For each card, identify the latest visit date
4. Identify the register(s) based on the diagnosis (N.B check if the service delivery unit and
is written on the summary sheet)
5. Match if all the relevant data elements in the card are recorded in the register
• If all the data elements in the medical record are recorded in the register, mark
that card as “Complete” in the second column
• If not, Mark as “Incomplete”
6. Count the number of complete medical records and divide it by the total number of sampled
medical records.
7. Analyze the level of under-reporting based on the decision table below
Share of complete records   Level of under-reporting
<50%                        Catastrophic level of under-reporting
50-75%                      Severe under-reporting
75-85%                      Moderate level of under-reporting
>85%                        Acceptable
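Steps 6 and 7 can be sketched as follows. The decision table does not say which band a value of exactly 50%, 75% or 85% belongs to; this sketch assumes 50% falls in the "severe" band, 75% and 85% in the "moderate" band, and only values above 85% are "acceptable". The function name is hypothetical.

```python
def under_reporting_level(complete_records, sampled_records):
    """Return (percentage complete, level) using the decision table bands."""
    if sampled_records <= 0:
        raise ValueError("at least one medical record must be sampled")
    share = 100 * complete_records / sampled_records
    if share < 50:
        level = "Catastrophic level of under-reporting"
    elif share < 75:          # assumption: exactly 50% counts as severe
        level = "Severe under-reporting"
    elif share <= 85:         # assumption: exactly 75% and 85% count as moderate
        level = "Moderate level of under-reporting"
    else:
        level = "Acceptable"
    return share, level


# Example: 8 of 10 sampled medical records were fully recorded in the register.
print(under_reporting_level(8, 10))
```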
1.3 Cross-check secondary data source (Registers) with the primary data source (Medical
records).
The purpose of this verification process is to measure the consistency of register and the
medical record. It particularly measures the level of over reporting.
Instructions
1. Select two core data elements from the sample indicators selected for data verification
2. From the respective register, select 5-10% of the total recorded data within the reporting
period.
Example: if the total SBA recorded in the register is 200, we take 5%, which is 10, to verify the
data at the medical record room.
3. Take out the medical records for the sampled cases. To randomly select medical records,
divide the total number recorded by the required sample size (e.g. 10) to obtain
the sampling interval.
In this example the sampling interval will be 20, i.e. we will take every 20th client/patient.
4. Match the recorded data in the register against the medical record
a. If the recorded data element in the register is found in the medical record, mark
that card as “Matched”
b. If not, mark it as “Not Matched”. A medical record that is not physically
available in the card room is also considered not matched.
5. For each data element, analyze the level of consistency based on the decision table below
Share of matched records   Level of consistency
<50%                       Catastrophic level of inconsistency
50-75%                     Severe inconsistency
75-85%                     Moderate level of inconsistency
>85%                       Acceptable
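The sampling interval in instruction 3 can be illustrated with a small Python sketch. The manual only says to take every 20th client; choosing the first record at random within the first interval is an assumption added here, and the function name is hypothetical.

```python
import random


def systematic_sample(total_records, sample_size, seed=None):
    """Return the sampling interval and the register row numbers to check."""
    if not 0 < sample_size <= total_records:
        raise ValueError("sample size must be between 1 and the total number of records")
    interval = total_records // sample_size
    # Assumption: start at a random row within the first interval.
    start = random.Random(seed).randint(1, interval)
    positions = [start + i * interval for i in range(sample_size)]
    return interval, positions


# Example from the text: 200 SBA entries in the register, 5% sample = 10 records.
interval, positions = systematic_sample(total_records=200, sample_size=10, seed=0)
print(interval, positions)
```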
Consistency Computing Sheet
Columns: Data element for selected indicators; Medical record #; Matched (recorded data
element in the register is found in the medical record); Not matched (data element recorded
in the register is not found in the medical record, or the medical record is not physically
available)
ANC 4 00057 X
00119 X
00362 X
00007 X
00137 X
Total 4 1
SBA 00999 X
01120 X
01070 X
00082 X
02200 X
Total 3 2
Possible reasons for inconsistency between the register and the medical record.
1. Over reporting
2. Data falsification
3. Loss of medical record
4. Service provision without medical record
• The team should document basic demographic information (name, kebele, got,
house number, phone number), the date of the service provided, and the type of service
provided before departing for household level verification.
• Review availability, completeness, and timeliness of reports from all Service Delivery
Sites. How many reports should there have been from all Sites? How many are there?
Were they received on time? Are they complete?
As part of the RDQA Assessment in Ethiopia, the FMoH would like to verify the data accuracy
and reporting performance of the Family Planning program. The indicator selected was
“Contraceptive Acceptance Rate.”
The Woredas and health facilities that were selected to be included in the RDQA assessment
were assigned across several assessment teams. Team #5 was responsible for conducting the
assessment at Endegagn Woreda Health Office in Gurage Zone.
Endegagn Woreda Health Office is expected to receive reports from 3 health facilities (1
primary hospital and 2 health centers) on a monthly basis. The reports should arrive by the
twenty-sixth day of the month. The reporting period selected for verification is December 2017.
Using the reports received (see below), verify the data and calculate the reporting performance at
the woreda level for the indicator “Contraceptive Acceptance Rate.” Please note that recounted
figures for the same period for Jane HC, Dinkula HC, and Dinkula hospital are 38, 62 and 80
respectively.
2) Systems Assessment
The purpose of the system assessment is to identify potential challenges to data quality created by the
data management and reporting systems at:
1. the service delivery sites, and
2. Any intermediary aggregation level (at which reports from service delivery Sites are
aggregated prior to being sent to the M&E Unit).
The system assessment has six areas to be checked at service delivery sites and
intermediate aggregation level:
1. M&E structure, functions and capabilities
2. Indicators definition and reporting guidelines
3. Data collection tools & reporting forms
4. Data management process
5. Links with national reporting system
6. Use of data for decision making
Although the system assessment identifies determinants of data quality, it also measures
some of the data quality dimensions. For example, confidentiality, legibility,
accessibility, and relevance are measured during the system assessment process.
Across the levels of the system, there are two key metrics we should know how to interpret
and use as we analyze our results and use them to create action plans for system
strengthening.
Verification Factor (VF)
What it is: The VF is the key metric for assessing the quality of the reported data,
by comparing the reported data to the source data (i.e., the register or
other HMIS record at the service delivery point).
What the scores mean: Values >100%: under-reporting, i.e., the recounted data (from the
primary source document) are higher than the reported value. This
means the report says there were fewer services rendered than your
source document shows.
Dashboards
The RDQA tool is designed to produce outputs that facilitate analysis and use of the data to
understand the current status of the data quality for selected indicators and develop a targeted
action plan. When completed electronically, a number of dashboards produce graphics of
summary statistics for each site or level of the reporting system and a “global” dashboard that
aggregates the results from all levels and sites included in the assessment.
Sample Outputs
Summary Tables
To simplify the process of reviewing feedback from various sites or at various levels, the latest
version of the RDQA tool has been updated to include worksheets with tables that automatically
populate with the comments and remarks about the responses to the RDQA questions. The
RDQA workbooks summarize results for Data verification quantitative comments, System
assessment comments and detail of system assessment.
Based on the findings at each site, the team will develop a specific action plan at that level and
provide feedback. In addition, after reviewing the overall results, the RDQA team should create
action plans to improve data quality and address the system assessment findings, based on the
objective of the study. Engaging the team members will create ownership of the plan and bring
direct insights from the people in the field. Decisions on where to invest resources for system
strengthening should be based on the relative strengths and weaknesses of the different functional
areas of the reporting system identified via the RDQA, as well as on practicality and feasibility.
Table x: Frequency of data quality techniques applied by administrative unit level and health facility
Table XX: Data quality techniques applied and data quality dimension addressed by administrative unit level and health facility
Section 5: Using DHIS2 to improve data quality
Duration: 2 days
Objectives
DHIS2 has several features that can help the work of improving data quality: validation during
data entry to make sure data are captured in the right format and within a reasonable range, user-
defined validation rules based on mathematical relationships between the data being captured
(e.g. subtotals vs totals), outlier analysis functions, as well as reports on data coverage and
completeness. More indirectly, several of the DHIS2 design principles contribute to improving
data quality, such as harmonizing data into one integrated data warehouse, supporting
local level access to data and analysis tools, and offering a wide range of tools for data
analysis and dissemination. With more structured and harmonized data collection processes and
with strengthened information use at all levels, the quality of data will improve. Here is an
overview of the functionality more directly targeting data quality:
The most basic data quality check in DHIS2 is to make sure that the data being captured
are in the correct format. DHIS2 will tell the user that the value entered is not in
the correct format and will not save the value until it has been changed to an accepted value. For
example, text cannot be entered in a numeric field. The different types of data values supported in
DHIS2 are explained in the user manual in the chapter on data elements.
To catch typing mistakes during data entry (e.g. typing ‘1000’ instead of ‘100’), DHIS2 checks
that the value being entered is within a reasonable range. This range is based on the data previously
collected by the same health facility for the same data element, and consists of a minimum
and a maximum value. As soon as the user enters a value outside this range, the user is alerted
that the value is not accepted. In order to calculate the reasonable ranges the system needs at
least six months (periods) of data.
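The idea of a range derived from earlier values can be illustrated with a small Python sketch. The manual does not state how DHIS2 derives the minimum and maximum, so the rule used here (mean plus or minus two standard deviations) is purely an assumption for illustration and is not DHIS2 code; the function names and figures are hypothetical. Only the requirement of at least six periods of data comes from the text.

```python
from statistics import mean, stdev

MIN_PERIODS = 6  # from the text: at least six months (periods) of data are needed


def reasonable_range(history, factor=2.0):
    """Return (minimum, maximum), or None when there is too little history.

    Assumption for illustration only: range = mean +/- factor * standard deviation.
    """
    if len(history) < MIN_PERIODS:
        return None
    centre, spread = mean(history), stdev(history)
    return max(0.0, centre - factor * spread), centre + factor * spread


def check_value(value, history):
    """Classify an entered value against the range built from past values."""
    limits = reasonable_range(history)
    if limits is None:
        return "no range available"
    low, high = limits
    return "ok" if low <= value <= high else "outside range"


past_values = [95, 102, 98, 110, 105, 100]   # hypothetical monthly figures
print(check_value(100, past_values))
print(check_value(1000, past_values))
```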
Validation rules can be defined through the user interface and later be run to check the
existing data. When running validation rules, the user can specify the organization units and
periods to check, since running a check on all existing data takes a long time and might
not be relevant. When the checks are completed, a report is presented to the user listing the
validation violations and explaining which data values need to be corrected.
Validation rule checks are also built into the data entry process, so that when the user has
completed a form the rules can be run to check the data in that form only, before closing the
form.
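As an illustration of a validation rule based on a mathematical relationship (subtotals vs totals), here is a minimal sketch. It is not the DHIS2 rule engine. The data element names and the `check_subtotals` helper are hypothetical.

```python
def check_subtotals(record, parts, total):
    """Return a violation message if the parts do not add up to the total, else None."""
    left = sum(record[name] for name in parts)
    right = record[total]
    if left != right:
        return f"{' + '.join(parts)} = {left}, but {total} = {right}"
    return None


# One completed form with made-up values; the subtotals do not match the total.
form = {"anc_under_20": 12, "anc_20_plus": 30, "anc_total": 45}
violation = check_subtotals(form, ["anc_under_20", "anc_20_plus"], "anc_total")
print(violation)
```

A report of such messages tells the user which data values need to be corrected before the form is closed.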
5.5. Outlier analysis
The standard deviation based outlier analysis provides a mechanism for revealing values that are
numerically distant from the rest of the data. Outliers can occur by chance, but they often
indicate a measurement error or a heavy-tailed distribution (leading to very high numbers). In the
former case one wishes to discard them while in the latter case one should be cautious in using
tools or interpretations that assume a normal distribution. The analysis is based on the standard
normal distribution.
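A standard deviation based outlier check can be sketched as follows. The threshold of 2 standard deviations and the function name are assumptions made for illustration; they are not settings taken from DHIS2.

```python
from statistics import mean, stdev


def find_outliers(values, threshold=2.0):
    """Return (index, value) pairs lying more than `threshold` standard deviations from the mean."""
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []  # all values identical, nothing can be an outlier
    return [(i, v) for i, v in enumerate(values) if abs(v - centre) / spread > threshold]


# Twelve monthly values for one data element (made-up numbers, one likely typing error).
monthly = [100, 104, 98, 101, 97, 103, 99, 1000, 102, 100, 96, 105]
print(find_outliers(monthly))
```

A flagged value is only a candidate for review: as noted above, it may be a measurement error or a genuinely high number.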
Completeness reports show how many data sets (forms) have been submitted by
organization unit and period. You can use one of three methods to calculate
completeness: 1) based on the completeness button in data entry, 2) based on a set of defined
compulsory data elements, or 3) based on the total registered data values for a data set.
The completeness reports also show which organization units in an area are reporting on
time, and the percentage of timely reporting facilities in a given area. The timeliness calculation
is based on a system setting called “Days after period end to qualify for timely data submission”.
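The completeness and timeliness percentages can be sketched as follows, assuming completeness is counted from submitted forms (method 1) and that both percentages use the number of expected organization units as the denominator. The function name, facility names and dates are hypothetical.

```python
from datetime import date


def reporting_rates(expected_units, submissions, period_end, timely_days):
    """Return (completeness %, timeliness %) for one data set and one period.

    submissions maps an organization unit to the date its form was submitted.
    timely_days stands in for the "days after period end" system setting.
    """
    reported = [u for u in expected_units if u in submissions]
    timely = [u for u in reported if (submissions[u] - period_end).days <= timely_days]
    n = len(expected_units)
    return 100 * len(reported) / n, 100 * len(timely) / n


units = ["HC A", "HC B", "HC C", "HC D"]
submitted = {
    "HC A": date(2018, 11, 5),
    "HC B": date(2018, 11, 20),
    "HC C": date(2018, 11, 10),
}
completeness, timeliness = reporting_rates(units, submitted, date(2018, 10, 31), 15)
print(completeness, timeliness)
```

Here three of four facilities reported (75%), and two of four reported within 15 days of the period end (50%).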
References
7. Anwer Aqil, Theo Lippeveld and Dairiku Hozumi (2009). PRISM framework: a paradigm
shift for designing, strengthening and evaluating routine health information systems.
Health Policy and Planning (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2670976/)
Annexes
Annexes: Data validation template
Decision Rules for a Sample Size of 12 and Coverage Targets/Averages of 20-95%
12 N/A 1 1 2 2 3 4 5 6 7 7 8 8 9 10 11
Across the levels of the system, there are two key metrics we should know how to interpret and use
as we analyze our results and use them to create action plans for system strengthening.
Verification Factor (VF)
What it is: The VF is the key metric for assessing the quality of the reported data.
It compares the reported data to the source data (i.e., the register or
other HMIS record at the service delivery point).
Scoring scale: 0-200%
What the scores mean:
Values >100%: Under-reporting (i.e., the data recounted from the
primary source document is higher than the reported value).
This means the report says there were fewer services rendered than
your source document shows.
100%: Perfect data quality (exact match of recounted to reported),
which is rare.
Values <100%: Over-reporting (i.e., the data recounted from the primary
source document is lower than the reported value).
This means the report says there were more services rendered than
your source document shows.
Acceptable values: For the purposes of the RDQA, 90-110% is
considered acceptable (within a 10% range of a perfect match).
Where you’ll see it in the results: Each of the dashboards for the individual sites and the
summary dashboard will have a bar chart of the verification factors for each
indicator on the chart titled “Data Verifications.” You’ll see a band
that shows the acceptable range of 90-110%. Bars that fall outside of
this band indicate the site is over- or under-reporting.
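The interpretation above implies that the VF is the recounted value divided by the reported value, expressed as a percentage. A minimal sketch under that assumption, with hypothetical function names:

```python
def verification_factor(recounted, reported):
    """Return the VF as a percentage: recounted source data over reported data."""
    if reported == 0:
        raise ValueError("reported value is zero; the VF is undefined")
    return 100 * recounted / reported


def classify(vf):
    """Apply the 90-110% acceptable band used for the RDQA."""
    if vf > 110:
        return "under-reporting"
    if vf < 90:
        return "over-reporting"
    return "acceptable"


# Made-up example: the register shows 120 services, the report says 100.
vf = verification_factor(recounted=120, reported=100)
print(vf, classify(vf))
```

In this example the VF is 120%, so the report shows fewer services than the register and the site is under-reporting.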
System Assessment Score
What it is: For each of the six dimensions of data quality, the RDQA tool has a
series of questions. The system assessment score for each dimension is
the average of the scores across the questions for that dimension.
This tells us the strength of the system for the individual dimensions,
which can help with identifying what the site is doing well and where
there are opportunities for improvements.
Scoring scale: 1-3
The scores correspond to each of the responses in the system
assessment as follows:
1 = No, not at all
2 = Yes, partly
3 = Yes, completely
Then, for each component, the scores for each individual question are
averaged to create an aggregate score. The lowest possible aggregate
score is 1, meaning all questions had a “no” response for that
component; the highest possible aggregate score is 3, meaning all
questions had a “yes” response for that component.
What the scores mean: The closer an aggregate score is to 3, the stronger the site or level of
the system is functioning for that component. The lower the score, the
poorer the performance.
Where you’ll see it in the results: Each of the dashboards for the individual sites and the
summary dashboard will have a spider graph that shows the results of the
assessment for each of the M&E system components. Read on to learn
more about how to interpret this chart type.
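The aggregate score for one component can be sketched as a simple average of the coded responses. The `component_score` helper and the example responses are hypothetical.

```python
# Coding of responses as given in the scoring scale above.
SCORES = {"No, not at all": 1, "Yes, partly": 2, "Yes, completely": 3}


def component_score(responses):
    """Average the coded responses for one component (result is between 1 and 3)."""
    coded = [SCORES[r] for r in responses]
    return sum(coded) / len(coded)


# Made-up responses to four questions under one component.
answers = ["Yes, completely", "Yes, partly", "Yes, partly", "No, not at all"]
print(component_score(answers))
```

These four responses average to 2.0, the middle of the 1-3 scale.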
Cross-Check Results
What it is: Cross-checks compare a subset of units in your source data to a
secondary source. The value reported for your cross-check indicates
the percent of the source records you selected that were also reported
in the comparison document.
Scoring scale: 0-100%
What the scores mean: The lower the value, the fewer of your source records also appeared in
a second data source.
If you conduct the cross-checks with ~5% of your source records and
the cross-check value is <90% (more than 1 in 10 records was missing
in your secondary document), select another ~5% or 10 records
(whichever is greater) to add to your sample.
Where you’ll see it in the results: The cross-checks are an additional means of assessing data
quality at the service delivery point and are included in the individual and
aggregate dashboards for the service delivery sites.
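The cross-check value and the rule for extending the sample can be sketched as follows. The function names and record identifiers are hypothetical, and rounding 5% up to a whole record is an assumption.

```python
import math


def cross_check(sampled_ids, secondary_ids):
    """Percent of sampled source records that also appear in the secondary source."""
    found = sum(1 for record_id in sampled_ids if record_id in secondary_ids)
    return 100 * found / len(sampled_ids)


def extra_sample_size(total_records, cross_check_pct):
    """Records to add when the cross-check is below 90%: ~5% or 10, whichever is greater."""
    if cross_check_pct >= 90:
        return 0
    return max(10, math.ceil(total_records / 20))


# Made-up example: 400 source records, 20 sampled (5%), 17 found in the second source.
sample = list(range(1, 21))
secondary = set(range(1, 18))
pct = cross_check(sample, secondary)
print(pct, extra_sample_size(400, pct))
```

Here 17 of 20 sampled records are found (85%), which is below 90%, so another 20 records would be added to the sample.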
If no, determine how this
might have affected reported
numbers.