Data Management

Trains staff on use of questionnaire Monitor: Reviews completed questionnaires for quality and completeness Data Manager: Designs data entry screens and validations based on questionnaire Statistician: Reviews final questionnaire for analysis Tsegaye Hailu Questionnaire Design Guidelines 1. Simple, easy to follow logical flow 2. Unambiguous questions and responses 3. Avoid leading questions 4. Use of standard terms and units 5. Skip patterns and conditional questions 6. Space for comments 7. Version control and dates 8. Subject identification 9. Signatures and dates Tsegaye Hailu Questionnaire Design Guidelines

Uploaded by

Jemal Ahmed
Data Management Training

April, 2017

Part-I
• Data Management
Part-II
• Data Capturing Tools

Data Management overview outline
• Definitions and Principles
• Unforeseen Problems and Solution Tools
• DM Process and Questionnaire Development
• Questionnaire Handling
• Procedures for Completion of Questionnaire
• Archiving of Questionnaire

Tsegaye Hailu
Definitions and Principles
“Data management” is a general term covering
procedures both for:
– the collection of data at study sites and
– the quality control of those data before and after they
have been submitted to a statistical analysis or
coordinating centre.
Data management includes all aspects of data
planning, handling, analysis, documentation and
storage, and takes place during all stages of a study.
The data management team is responsible for producing
high-quality databases containing high-quality data that
meet operational, clinical and regulatory requirements.
WHY COLLECT THE DATA?

What do you think is the importance of collecting health data, or of a study in general?

WHY COLLECT THE DATA? (1)
To meet the objectives of the study and of the health and patient
management strategy.

Hopefully the study will lead to improvement in:

Government policies on the TB/HIV strategy etc.

Reduction in transmission (spread) of the disease

Therefore reduction in disease and mortality

WHY COLLECT THE DATA ? (2)
It is very important therefore to pay attention to:

• Data Content

• Accuracy of the Data

• Relevance of the Data

• Completeness of the Data

To Achieve Quality of Data:

All health/study data and information should be:

- Recorded
- Handled
- Stored
in a way that allows its accurate reporting,
interpretation, and verification.

Leading to:
Credible health research data.
When does Data Management Begin?
It begins with the overall planning process of the research,
survey, or whatever the purpose is.

Hence, the Data Management Team should be involved in
the overall planning of the research or survey.

This will help with important decisions on:

Rate of data collection
Rate of data processing
Project timetable
Calculation of budget
When does Data Management Begin?
Contributing to Protocol Writing
Contributing to Questionnaire Development
Contributing to Database Design and Writing of Data
Dictionaries
Writing of Data Management and other SOPs
Data Entry, Validation, Cleaning and Archiving of
Research Data & Documents
Preparing data for Statistical Analysis
Completion of Research / Study Report

Data Management motto: G I G O

GARBAGE IN GARBAGE OUT

• It means that if your data are of poor quality, the
results of any analysis will be unreliable.

• One of the major roles of Data Management is to
minimise error at all stages of the study and not
just at the computing stage.

PROBLEMS

WHAT POSSIBLE PROBLEMS CAN YOU ENVISAGE?
UNFORESEEN PROBLEMS
Data and/or software occasionally get corrupted for
unknown reasons.

Hardware problems include breakdown of the computer
(hard disk) or of the CD or USB containing backup data,
or equipment being struck by lightning.

Printer breakdown or fault (more annoying during
report production).

UNFORESEEN PROBLEMS cont’d

Power cuts or UPS breakdown can lead to loss of data
(an hour's, a day's, a week's or even months' work).

Flooding or fire outbreak destroying forms

Termites eating questionnaire forms

Data security and confidentiality broken.

FIELD AND DATA ENTRY PROBLEMS
Difficulty getting exact dates from subjects.
Dates: _ _ /01/ 98, 1 month ago, weeks ago.

D.O.B and/or age not entered;
time is wasted if forms are poorly filled.
Weeks entered for days, months for weeks, and vice versa.

Missing values though a box is provided;
you don't know whether the question was asked or not.
An unanswered Y or N does not mean the answer is NO.
FIELD AND DATA ENTRY PROBLEMS

Some result units recorded per uL, others per L

Condition says jump (skip), yet people fill in the next question

Consistency, e.g. 1st Name before 2nd Name, not mixed
up; 1 for Y, 2 for N, not mixed up

Writing legibly
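
The mixed-units problem listed above (some results per uL, others per L) can be handled by converting every value to one agreed unit before analysis. The following is a minimal Python sketch; the function name and the unit labels are assumptions made for illustration, and the only fact it relies on is that 1 L = 1,000,000 uL.

```python
# Illustrative sketch only: the function name and unit labels are assumptions.
MICROLITRES_PER_LITRE = 1_000_000  # 1 L = 10^6 uL


def to_per_microlitre(value, unit):
    """Convert a count recorded per uL or per L into a count per uL."""
    if unit == "per uL":
        return value
    if unit == "per L":
        return value / MICROLITRES_PER_LITRE
    # Anything else is flagged rather than guessed at.
    raise ValueError(f"Unknown unit: {unit!r}")


# A white cell count of 4.5 x 10^9 per L is the same as 4500 per uL.
print(to_per_microlitre(4.5e9, "per L"))
```

Raising an error for an unrecognised unit, instead of passing the value through, forces a query back to the field rather than silently mixing scales.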

How Can Data Management Solve these Problems?

Use the tools at its disposal to prevent, as much as
it possibly can, the unforeseen problems.

“PREVENTION IS BETTER THAN CURE”

TOOLS FOR THE PROBLEMS (1)
DETECTING ERRORS IN DATA
Manual Checking: Manually going through forms, done by
interviewers (field workers) and data supervisors (if any)

Checking During Data Entry: Check files or programs
written (interactive checking)

Checking After Data Entry: (Batch checking)

Validation (and/or Verification)

TOOLS FOR THE PROBLEMS (2)
DETECTING ERRORS IN DATA

VERIFICATION:
• Used to ensure that the data entered are actually the data
on the questionnaire. This is normally
accomplished by double entry (entry by two
different clerks).

• Verified data does not necessarily mean accurate
data. If data are invalid from the field, they will be
verified correctly using double entry but will still
remain invalid.
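
Verification by double entry amounts to comparing the two clerks' files field by field and listing every disagreement. A minimal Python sketch of that comparison follows; the record layout (a dictionary keyed by subject number) and the function name are assumptions for illustration, not the format used by any particular data entry package.

```python
# Illustrative sketch: the record layout and function name are assumptions.
def compare_entries(first_entry, second_entry):
    """Return {record_id: {field: (value1, value2)}} for every disagreement."""
    discrepancies = {}
    for record_id in sorted(set(first_entry) | set(second_entry)):
        rec1 = first_entry.get(record_id, {})
        rec2 = second_entry.get(record_id, {})
        diffs = {}
        for field in sorted(set(rec1) | set(rec2)):
            if rec1.get(field) != rec2.get(field):
                diffs[field] = (rec1.get(field), rec2.get(field))
        if diffs:
            discrepancies[record_id] = diffs
    return discrepancies


clerk_a = {"001": {"age": "19", "sex": "1"}, "002": {"age": "34", "sex": "2"}}
clerk_b = {"001": {"age": "91", "sex": "1"}, "002": {"age": "34", "sex": "2"}}
print(compare_entries(clerk_a, clerk_b))  # {'001': {'age': ('19', '91')}}
```

Each discrepancy is resolved by going back to the paper questionnaire. Agreement between the two entries only shows that both clerks typed what was on the form, not that the value on the form is valid.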
TOOLS FOR THE PROBLEMS (3)
DETECTING ERRORS IN DATA

VALIDATION: A means of ensuring that data entered
into the data file is valid according to some criteria
arrived at by an expert in the field.

TOOLS FOR THE PROBLEMS (4)
DATA PROCESSING PROCEDURES

Good reception and storage of questionnaires.
Ensure they are not damaged or dirty

Good data entry: good choice of personnel,
training and software

Verification or Double Entry (if used)

TOOLS FOR THE PROBLEMS (5)
DATA PROCESSING PROCEDURES

Frequent checking and/or editing of data in
preparation for analysis

Frequent backing up of data (having copies
elsewhere)

Archiving of data
DM: ISSUES IN DEVELOPING COUNTRIES

Questionnaire designs, data capturing tools, storage and
archiving are not optimal and need improvement

Our dates are different from the rest of the world. We need
to synchronize dates with other countries

Using unlicensed software for entering and analyzing
data.

No formal training of staff on the latest database technologies
and statistical software

CONCLUSION

Data management plays the singular and central role
as the link between all aspects and disciplines of any
project, from the field work to the laboratory, clinic,
regulatory bodies, sponsors and the statistics.

Do not start any study without first consulting a Data
Management Unit.

Additional slides on
“DM Process and Questionnaire Development” next

DATA MANAGEMENT IN OR FORMS

Questionnaire / Form:
Participants’ data are collected in the study questionnaire with
unique identifiers on each form and specimen label.

Data management and quality assurance:
Raw data from questionnaires are double entered with
programmed computer checks to identify data entry errors.

Responsibility:
It is the investigators’ responsibility to ensure accuracy,
legibility and completeness of data entry in the questionnaire
and in all other required report forms and logs.
DATA MANAGEMENT PROCESS

[Flow diagram. Main sequence: Protocol, pQES development, eQES design and validation, Data Entry, Data Validation, Clean File, Statistical Analysis, Completion of Study Report, QES Archiving. Side elements: Documentation (ongoing), Data Entry Instructions, and Queries exchanged with the Monitor and Investigator.]

Purpose of Designing Questionnaire
Collects relevant data in a specific format,
in accordance with the protocol and
in compliance with regulatory requirements (IEC/IRB)

Allows for efficient and complete data processing,
analysis and reporting

Facilitates the exchange of data across projects
and organizations, esp. through standardization.

Questionnaire Relationship to Protocol

Protocol determines what data should be collected
on the questionnaire

All data must be collected on the questionnaire if
specified in the protocol

Data that will not be analyzed should not appear
on the questionnaire / CRF

Questionnaire Development Process
Designer: Drafts questionnaire from protocol

Reviewers: questionnaire review meeting; comments
back to designer

Designer: Finalizes and prints questionnaire for use

Key identifying information: MUST HAVES

Study Number
Site/Center Number
Subject identification number
Questionnaire Development cont’d
Guidelines (SOP will be shown)
• Collect data with all users in mind

• Collect data outlined in the protocol

• Be clear and concise with your data questions

• Avoid duplication

• Request minimal free text responses

Questionnaire Development cont’d
Guidelines (cont..)
• Provide units to ensure comparable values

• Provide instructions to reduce misinterpretations

• Provide “choices” for each question
  (allows for computer summarization)

• Use “None” and “Not done” where appropriate

Poorly Designed Questionnaire
• Data not collected

• Collected too much data: wasted resources in
collection and processing

• Database may require modification

• Data Entry process impeded

• Need to edit data

• Target dates are missed

Guidelines for filling CRF
Have clear instructions on how to complete
• All sections must be completed at the time of subject visit
or as soon as results are available
• All entries must be attributable, accurate, legible and
complete

• In BLACK BALL POINT PEN

• Incorrect entries are crossed out with a single line, dated
and initialed, and explained where necessary

• No “WHITE OUT”, erasers, ink pens or pencils

Submitting Questionnaire
“The investigator should ensure the accuracy, completeness,
legibility, and timeliness of the data reported to the sponsor in
the CRFs (Questionnaire) and in all required reports.”
ICH-GCP 4.9.1

Common errors in submitted questionnaires include:

corrections not dated / initialed
incorrect data
dates in wrong format
missing entries
wrong units
use of wrong colour pen
data for wrong subject
missing signatures
lack of consistency between parts of the CRF
DM AND Questionnaire (3)
QES HANDLING: Reception and Data Entry (I)
• DM/DEC checks the number of forms and corresponding PINs
in the log book.

• DM/DEC signs for the number of forms and date received.
Field Co-ordinator signs to log out the forms.

• Data entered using agreed software.
– Double Entry is done by two data entry clerks using two
computers.
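
The reception check described above (forms received against PINs in the log book) is a comparison of two lists. A minimal Python sketch follows; the function name and the PIN values are hypothetical, chosen only to illustrate the idea.

```python
# Illustrative sketch: the function name and PIN format are assumptions.
def check_reception(logged_pins, received_pins):
    """Compare PINs in the log book with PINs on the forms actually received."""
    logged, received = set(logged_pins), set(received_pins)
    return {
        "missing": sorted(logged - received),      # logged but not received
        "unexpected": sorted(received - logged),   # received but not logged
        "duplicates": sorted(
            {pin for pin in received_pins if received_pins.count(pin) > 1}
        ),
    }


log_book = ["P001", "P002", "P003"]
batch = ["P001", "P003", "P003", "P009"]
print(check_reception(log_book, batch))
```

Any non-empty list in the result is a reason not to sign for the batch until the Field Co-ordinator has explained the difference.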

Questionnaire Safety and Precautions
• Keep questionnaires in a well-protected location.

• Do not give questionnaires to study participants.

• Store questionnaire binders in metal cabinets.

• Only authorized study personnel should have access to
questionnaires.

PROCEDURE
1. State of the Questionnaires:
• Verify that each questionnaire page conforms to the
procedures to be performed on that study day.

• Each time a questionnaire page is completed, verify
that it corresponds to the correct study participant and
record the participant study number onto all CRF pages.

• Verify that all questionnaire pages are present

• Verify that questionnaire pages are not damaged

PROCEDURE 2 CONT’D

Missing information should be recorded in the entry
field as follows:

NA = data not applicable
ND = evaluation or assessment was
applicable but not done
NK = information requested is unknown
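
At data entry, the three codes above need to be kept apart from real values and from fields that were simply left blank. A minimal Python sketch, assuming a hypothetical function name and that each field arrives as a text string:

```python
# Illustrative sketch: the function name and return format are assumptions.
MISSING_CODES = {
    "NA": "data not applicable",
    "ND": "applicable but not done",
    "NK": "information unknown",
}


def classify_entry(raw):
    """Return (status, value) for one field as written on the questionnaire."""
    text = raw.strip()
    if text == "":
        return ("blank", None)            # blank fields should raise a query
    if text.upper() in MISSING_CODES:
        return ("missing", text.upper())  # an explained absence of data
    return ("value", text)


print(classify_entry(" nd "))   # ('missing', 'ND')
print(classify_entry(""))       # ('blank', None)
print(classify_entry("37.2"))   # ('value', '37.2')
```

Keeping "blank" as its own status reflects the point made earlier in the training: an empty box does not tell you whether the question was asked.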

PROCEDURE 3 CONT’D

Use the 24-hour clock to record time (e.g. 22:40 instead
of 10:40 PM). One day runs from 00:00 (midnight)
to 23:59 (one minute before midnight of the next day).
A time specification of 24:00 is invalid.

Numbers should be right-justified and recorded using
leading zeros when necessary.

Numbers should contain zeros in the tens, hundreds
and thousands columns as necessary (i.e. there should
not be blank fields).
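
Both rules above lend themselves to an automatic check at data entry. A minimal Python sketch follows; the function names are hypothetical, and the zero-padding helper assumes non-negative whole numbers.

```python
import re

# Illustrative sketch: function names are assumptions.
# Hours 00-23, a colon, minutes 00-59.
TIME_PATTERN = re.compile(r"([01]\d|2[0-3]):([0-5]\d)")


def is_valid_24h_time(text):
    """True for 00:00 through 23:59; 24:00 and 12-hour forms are rejected."""
    return TIME_PATTERN.fullmatch(text) is not None


def pad_number(value, boxes):
    """Right-justify a non-negative whole number using leading zeros."""
    text = str(value).zfill(boxes)
    if len(text) > boxes:
        raise ValueError(f"{value} does not fit in {boxes} boxes")
    return text


print(is_valid_24h_time("22:40"))     # True
print(is_valid_24h_time("24:00"))     # False
print(is_valid_24h_time("10:40 PM"))  # False
print(pad_number(7, 3))               # 007
```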
PROCEDURE 4 CONT’D

• Record numbers using decimals only (i.e., not fractions).

• Values should be recorded using the units specified on
the questionnaire.

• Data reported on the questionnaires that are derived
from source documents should be consistent with the
source documents or the discrepancies should be
explained.

MAKING CORRECTIONS:
a. Authorized actions:
• Cross out the wrong entry with a single line
• Write the correct entry alongside/above/under the
wrong entry
• Initial the correction
• Date the correction

b. Prohibited actions:
• Using correction fluid
• Erasing or overwriting entries
• Intentionally entering false data
• Making illegible entries

Examples of Data onto CRFs
1. Entering the Data:

Specify: This is text
– The text field, represented by a thin line. This
is where you can enter text
Temperature: °C
– The numerical data field. Use these fields to
enter numerical data. Enter leading zeros if
there are extra boxes.

Examples of Data onto CRFs 1 cont’d
__ / __ / 2004
dd / mm / yyyy
– The date field. Please record the date in the
European format (i.e. day/month/year).

__ __ : __ __
24 hours
– The time field. Please record the time in 24
hour clock format.
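
The day/month/year convention above can be enforced when dates are keyed in. A minimal sketch using Python's standard `datetime.strptime`; the function name is hypothetical.

```python
from datetime import datetime


# Illustrative sketch: the function name is an assumption.
def parse_crf_date(text):
    """Parse a dd/mm/yyyy date; return None if it is not a real calendar date."""
    try:
        return datetime.strptime(text.strip(), "%d/%m/%Y").date()
    except ValueError:
        return None


print(parse_crf_date("09/05/2016"))  # 2016-05-09 (9 May 2016)
print(parse_crf_date("31/02/2016"))  # None: February has no 31st
print(parse_crf_date("05/13/2016"))  # None: there is no month 13
```

The last case shows why the format matters: a date written month-first is either rejected or, worse, silently read as a different day when both parts are 12 or less.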

Examples of Data onto CRFs 1 cont’d

2. Correction procedure
_08_/_05_/2016 TH
09/05/2016

If an error has occurred, please use the following correction
procedure:
· Draw a single line through the error, so the original data
can still be seen
· Write the correct value next to the original entry
· Initial and date the correction

Data Entry and Validation
Data processing errors are errors that occur
after data have been collected. Examples of
data processing errors include:
• Transpositions (e.g., 19 becomes 91)
• Copying errors (e.g., 0 (zero) becomes O)
• Coding errors (e.g., a racial group).
• Routing errors (e.g., the interviewer asks the wrong
question or asks questions in the wrong order)
• Consistency errors (contradictory responses, such as
the reporting of a hysterectomy after the respondent
has identified himself as a male)
• Range errors (responses outside of the range)
Data Entry and Validation cont’d
To prevent such errors, you must identify the stage
at which they occur and correct the problem.
Methods to prevent data entry errors include:
• Manual checks during data collection (e.g., checks
for completeness, handwriting legibility)
• Range and consistency checking during data entry
(e.g., preventing impossible results, such as ages
greater than 110)
• Double entry and validation following data entry
• Screening for outliers during data
analysis
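
Range and consistency checking can be sketched in a few lines of Python. The field names and codes below are hypothetical; the upper age limit of 110 and the hysterectomy example come from the slides, while the lower limit of 0 is an assumption.

```python
# Illustrative sketch: field names, codes and the lower age limit are assumptions.
def check_record(record):
    """Return a list of problems found in one entered record."""
    problems = []

    # Range check: reject impossible ages.
    age = record.get("age")
    if not isinstance(age, int) or not 0 <= age <= 110:
        problems.append("range: age must be a whole number from 0 to 110")

    # Consistency check: two answers that cannot both be true.
    if record.get("sex") == "M" and record.get("hysterectomy") == "Y":
        problems.append("consistency: hysterectomy reported for a male respondent")

    return problems


print(check_record({"age": 34, "sex": "F", "hysterectomy": "Y"}))   # []
print(check_record({"age": 190, "sex": "M", "hysterectomy": "Y"}))  # two problems
```

Run interactively, such a check stops the clerk at the offending field; run in batch after entry, it produces a list of queries for the field team.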
DM AND Questionnaire (6)
BACK-UPs & ARCHIVING :
• Back-up of data entered should be on DM’s computer/
CD/USB at the end of each day/week/month as appropriate.

• Questionnaires after entry should be filed and kept in a
locked cabinet. DM & PI keep keys to the cabinets.

• Final cleaned data are sent to the Institute’s or designated
Statistician for analysis and presentation.

• At the end of the study, all questionnaires should be
archived by the Investigator and/or Sponsor
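
A routine back-up can be as simple as copying the data file under a date-stamped name and confirming that the copy is identical. A minimal Python sketch using only the standard library; the function names, folder layout and file name `first.rec` are assumptions for illustration.

```python
import hashlib
import shutil
import tempfile
from datetime import date
from pathlib import Path


# Illustrative sketch: function names and folder layout are assumptions.
def sha256_of(path):
    """Return the SHA-256 hex digest of a file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def back_up(data_file, backup_dir, today=None):
    """Copy data_file into backup_dir under a date-stamped name and verify it."""
    data_file, backup_dir = Path(data_file), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (today or date.today()).isoformat()
    target = backup_dir / f"{data_file.stem}_{stamp}{data_file.suffix}"
    shutil.copy2(data_file, target)
    if sha256_of(data_file) != sha256_of(target):
        raise IOError(f"Backup of {data_file} is not identical to the original")
    return target


# Demonstration in a temporary folder so that no real data are touched.
with tempfile.TemporaryDirectory() as folder:
    source = Path(folder) / "first.rec"
    source.write_text("001,19,1\n")
    copy = back_up(source, Path(folder) / "backup", today=date(2017, 4, 1))
    demo_name = copy.name
print(demo_name)  # first_2017-04-01.rec
```

In practice the backup folder would sit on a different device from the working copy (the CD or USB mentioned above), since a copy on the same disk does not protect against hardware failure.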
Data Capturing Tools

Data Capturing Tools outline

Web based system
• REDCap
• OpenClinica
• Personal Digital Assistant (PDA)
Stand alone Database
• Microsoft Access
• SPSS
• Epi Info
• EpiData
REDCap
• REDCap (Research Electronic Data Capture) was developed by an
informatics team at Vanderbilt University with ongoing support
from NCRR and NIH grants
• Browser-based
• Designed to address common problems for academic
biomedical researchers hoping to use electronic databases
• Also supports clinical & translational research databases
• Widely used in the academic research community
• Expensive for investigator-initiated studies or other such
studies at a smaller scale
REDCap provides
• User-friendly, web-based case report forms
• Real-time data entry validation (e.g. for data
types and range checks)
• Audit trails, and the ability to set up a calendar to
schedule and track critical study events such as
blood-draws, participant visits, etc.
• Also, designated users can assign different levels
of access for each member of the research team.
Advantages of REDCap:

• Secure and web-based. Input data from anywhere in the world with secure web authentication,
data logging, and Secure Sockets Layer (SSL) encryption.
• Fast and flexible. Conception to production-level database in less than one day.
• Multisite access. Projects can be used by researchers from multiple sites and institutions.
• Fully customizable. You are in total control of shaping your database or survey.
• Advanced question features. Auto-validation, branching logic, and stop actions.
• Mid-study modifications. You may modify the database or survey at any time during the study.
• Data import functions. Data may be imported from external data sources to begin a study or to
provide mid-study data uploads.
• Data comparison functions. Double data entry / Blinded data entry.
• Export survey results to common data analysis packages. Export your
data to Microsoft Excel, SAS, STATA, R, or SPSS for analysis.
• Save your survey or forms as PDFs. Generate a PDF version for printing in order to collect
[Screenshots: REDCap login interface; view after login; different features]

OpenClinica
• The world’s most widely-used,
open-source software for clinical research
• First released in 2005
• Designed to meet the diverse needs of
modern research environments
• Built as a lightweight, extensible, and modular
application
• Web browser based

Important Features of OpenClinica
• Organization of research by study protocol and site.
• Dynamic generation of web-based CRFs in portable Excel
templates.
• Management of longitudinal data for recurring patient visits
• Data import/export tools for migration of study datasets.
• Interfaces for data query and retrieval across subjects, time, and
clinical parameters
• Compliance with regulatory guidelines e.g. 21 CFR Part 11
• Built on robust and scalable technology infrastructure interoperable
with relational databases
[Screenshots: OpenClinica login interface; view after login and different features; view after login and different project]

Working with OpenClinica
• Policy determination needed
• Required human and material resources allocated
• When know-how is established, utilization requires only
5 main steps:
– Designing
– Creating CRFs
– Event definitions
– Data Entry
– Data Extraction

Designing CRFs
Done in Excel using a blank CRF template
provided by OpenClinica

Uploading CRFs
The Excel sheet is then uploaded onto OpenClinica

[Screenshots: Event Definition; Data Entry]

EpiData

EpiData is a Windows-based program for:

• Designing data structures
• Simple data entry
• Entering data and applying validating principles
• Editing / correcting data already entered
• Asserting that the data are consistent across variables
• Printing or listing data for documentation of error-checking
and error-tracking
• Comparing data entered twice
• Exporting data for further use in statistics
EpiData Main Features
• Questionnaire design
• Make data file
• Check
• Data entry
• Data export
• Documentation

How to work with EpiData?
“Work Process toolbar”

Define Data

• Point at the “Define data” part and choose “new qes file”
• Save the empty file and give it the name first.qes

EpiData (1)

1. Now write in the Epi-Editor the lines shown.
Explanation: Each line has three elements:
A. Name of variable (e.g. v1)
B. Text describing the variable
(e.g. sex or "day of birth")
C. An input definition, e.g. ## for two digits
2. Save the file again as done in point 1
3. Now preview the data form

EpiData (2)
Close the form as well as the Epi-Editor
Proceed to the next section: Create DataFile
Accept the “first.qes” and “first.rec” names
for “make datafile”
The data form is saved as first.qes
The data file, which will contain the data, is saved as
first.rec.
Add checks of Data Entry
Click “Add checks of Data Entry”
Checks specify rules for data entry, for example:

• Range: 10-80 plus single value 99

• Jumps: on value 1 go to s2, written 1>s2
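
The logic behind these two checks can be sketched in plain Python. This is not EpiData check-file syntax; the function names and the field order are assumptions made only to show what a "range plus legal value" rule and a jump rule do during entry.

```python
# Illustrative sketch in plain Python, not EpiData syntax; names are assumptions.
def is_legal(value, low=10, high=80, extra_legal=(99,)):
    """Accept values inside the range plus the listed single values."""
    return low <= value <= high or value in extra_legal


def next_field(current, value, field_order, jumps):
    """Return the field to visit next, honouring any jump for this value."""
    target = jumps.get(current, {}).get(value)
    if target is not None:
        return target
    position = field_order.index(current)
    if position + 1 < len(field_order):
        return field_order[position + 1]
    return None  # end of the form


field_order = ["v1", "v2", "s2"]
jumps = {"v1": {1: "s2"}}   # on value 1 in v1, go to s2 (the rule 1>s2)

print(is_legal(45), is_legal(99), is_legal(85))   # True True False
print(next_field("v1", 1, field_order, jumps))    # s2
print(next_field("v1", 2, field_order, jumps))    # v2
```

The single value 99 is typically reserved as a code for missing or unknown, which is why it is allowed even though it lies outside the range.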

Now add value labels to a variable

Data Entry
• Continue with Enter Data
• Simply activate Enter Data on the toolbar
and accept first.rec for data entry
• Double Entry of Data:
Tools -> Prepare double data entry
Export, Analysis and options
• Export to any data format

Data management and analysis using Stata

• Running Stata
• [Screenshot: Stata windows]

Data management using Stata (6)
• Simple linear regression: regress, rvfplot,
other diagnostics
• Correlation: corr, spearman, ktau. I tend not
to use corr because of the sensitivity to the
normality assumption for tests and confidence
intervals
• Only pwcorr, and not corr, provides a test of
significance

THANK YOU
