Data Quality Training
Data quality: all the features and characteristics of a data set that affect its
ability to serve its intended purposes, e.g., analysis.
Dimensions of Data Quality
Completeness
• Represents the complete list of records (eligible persons, facilities, units), with
all fields in each record provided.
Reliability
• Complete and accurate, consistently measures the intended indicator, and is not
subject to inappropriate alteration over time.
Precision
• Sufficient detail is provided, e.g., data can be disaggregated by gender, age, etc.
Timeliness
• Up to date (current), and information is available on time.
Integrity
• System used to generate data is protected from deliberate bias or
manipulation for political or personal reasons.
Confidentiality
• Clients are assured that their data will be maintained according to national and/or
international standards for data protection.
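Two of the dimensions above, completeness and timeliness, lend themselves to simple numeric indicators. The following is a minimal Python sketch, not a prescribed method: the field names (`facility`, `period`, `clients_served`, `submitted`), the sample reports, and the reporting deadline are all hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical required fields for a monthly facility report (illustrative only).
REQUIRED_FIELDS = ("facility", "period", "clients_served")


def field_completeness(records, required=REQUIRED_FIELDS):
    """Share of required fields that are filled in across all records."""
    total = len(records) * len(required)
    if total == 0:
        return 0.0
    filled = sum(
        1
        for rec in records
        for field in required
        if rec.get(field) not in (None, "")
    )
    return filled / total


def reporting_completeness(records, expected_facilities):
    """Share of expected facilities that submitted at least one record."""
    expected = set(expected_facilities)
    if not expected:
        return 0.0
    reported = {rec.get("facility") for rec in records}
    return len(reported & expected) / len(expected)


def timeliness(records, deadline):
    """Share of records submitted on or before the deadline."""
    if not records:
        return 0.0
    on_time = sum(1 for rec in records if rec["submitted"] <= deadline)
    return on_time / len(records)


# Hypothetical sample data: facility B left one field blank and reported late.
reports = [
    {"facility": "A", "period": "2024-01", "clients_served": 120,
     "submitted": date(2024, 2, 5)},
    {"facility": "B", "period": "2024-01", "clients_served": None,
     "submitted": date(2024, 2, 20)},
]

print(field_completeness(reports))
print(reporting_completeness(reports, ["A", "B", "C", "D"]))
print(timeliness(reports, date(2024, 2, 15)))
```

Keeping record-level completeness (fields filled in) separate from reporting completeness (expected units that reported) mirrors the two parts of the completeness definition.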
Issues with Data Quality
Systematic Errors
o Programming mistakes
o Bad definitions for data types or models
o Violations of rules established for data collection
o Poorly defined rules
o Poor training
Random Errors
o Keying errors
o Data transcription problems
o Illegible handwriting
o Hardware failure (e.g., breakdown or corruption)
o Deliberately misleading statements by patients or care providers
Data Quality for SRHR research
Initial examination
o Missing data
o Repeated numbers
o Grouping data over many months (smoothing)
o Outliers
o Internal validity (logic)
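Several of the initial checks listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a standard procedure: the outlier rule (a modified z-score based on the median absolute deviation, with a cut-off of 3.5), the minimum run length of three for repeated numbers, and the field names `clients_seen` and `clients_tested` are all choices made here for the example and do not come from the training material.

```python
from statistics import median


def missing_periods(series):
    """Return the periods whose value is missing (None)."""
    return [period for period, value in series if value is None]


def repeated_runs(series, min_run=3):
    """Return (start_period, value, length) for each run of identical
    values reported in at least `min_run` consecutive periods."""
    runs = []
    run_start, run_value, run_len = None, None, 0
    for period, value in series:
        if value is not None and value == run_value:
            run_len += 1
            continue
        if run_value is not None and run_len >= min_run:
            runs.append((run_start, run_value, run_len))
        run_start, run_value, run_len = period, value, 1
    if run_value is not None and run_len >= min_run:
        runs.append((run_start, run_value, run_len))
    return runs


def mad_outliers(series, threshold=3.5):
    """Flag periods whose modified z-score exceeds `threshold`.
    Assumed rule: 0.6745 * |x - median| / MAD; missing values are skipped."""
    observed = [(p, v) for p, v in series if v is not None]
    if not observed:
        return []
    values = [v for _, v in observed]
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # no spread to measure against; rule not applicable
    return [p for p, v in observed if 0.6745 * abs(v - centre) / mad > threshold]


def logic_violations(records):
    """Internal validity: a subset count must not exceed its total.
    `clients_tested` and `clients_seen` are hypothetical field names."""
    return [r["period"] for r in records if r["clients_tested"] > r["clients_seen"]]


# Hypothetical monthly counts for one indicator at one facility.
monthly = [
    ("2024-01", 100), ("2024-02", 103), ("2024-03", None),
    ("2024-04", 98), ("2024-05", 98), ("2024-06", 98), ("2024-07", 400),
]
visits = [
    {"period": "2024-01", "clients_seen": 100, "clients_tested": 80},
    {"period": "2024-02", "clients_seen": 103, "clients_tested": 120},
]

print(missing_periods(monthly))
print(repeated_runs(monthly))
print(mad_outliers(monthly))
print(logic_violations(visits))
```

A flagged value is only a prompt for follow-up: a repeated number or an outlier may be genuine, so each flag should be checked against the source documents before any correction is made.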
Routine examination
o All of the above checks
o Timeliness
o Random data quality checks
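A random data quality check can be as simple as drawing a random sample of records to verify against their source documents. The sketch below assumes a hypothetical list of record identifiers and a hypothetical sample size; fixing the seed is optional and is used here only so that the same sample can be reproduced for audit purposes.

```python
import random


def spot_check_sample(record_ids, k, seed=None):
    """Pick up to k distinct records at random for verification
    against source documents."""
    rng = random.Random(seed)  # local generator; does not affect global state
    ids = list(record_ids)
    return rng.sample(ids, min(k, len(ids)))


# Hypothetical record identifiers for one reporting period.
record_ids = [f"REC-{n:03d}" for n in range(1, 21)]
sample = spot_check_sample(record_ids, k=3, seed=42)
print(sample)
```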
Addressing Issues with Data Quality
Missing Data and Outliers
Documentation Review
o Secondary source (sub-county and county level): summary forms (MoH tools),
electronic databases (DHIS2), and project templates.