Data Practices
– Avoid duplication
ms | Marital status | 1-unmarried, 2-married, 3-widow, 4-separated/divorced | Numeric | 1,4 | Nominal
smknum | Number of cigarettes smoked per day | Numeric | Discrete
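A coding scheme like this can also be checked by program after entry. The following is a minimal Python sketch, assuming each record is a dictionary; the `CODEBOOK` structure and the `check_record` helper are hypothetical names used only for illustration, and the rule that `smknum` cannot be negative is an assumption.

```python
# Hypothetical codebook built from the two variables listed above.
# "valid" holds the defined codes; None means any non-negative count.
CODEBOOK = {
    "ms": {"label": "Marital status", "valid": {1, 2, 3, 4}},
    "smknum": {"label": "Number of cigarettes smoked per day", "valid": None},
}


def check_record(record):
    """Return a list of problems found in one record (a dict)."""
    problems = []
    for name, spec in CODEBOOK.items():
        value = record.get(name)
        # Both variables are numeric, so anything else is an entry error.
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"{name}: expected a whole number, got {value!r}")
        elif spec["valid"] is not None and value not in spec["valid"]:
            problems.append(f"{name}: {value} is not a defined code")
        elif spec["valid"] is None and value < 0:
            problems.append(f"{name}: {value} cannot be negative")
    return problems


print(check_record({"ms": 2, "smknum": 10}))  # no problems
print(check_record({"ms": 5, "smknum": 10}))  # undefined marital status code
```

Keeping the codes in one place means the same table can document the data and drive the checks.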
MICROSOFT EXCEL
Department of Biostatistics, St. John's Medical College, Bangalore
Errors in data entry
• Transposition (ex: 39 becomes 93)
• Check your date formats. A date may look right in Excel, but it is
imported according to its internal representation, which may
not be in the expected format
– You can use DD/MM/YYYY
• Keep variable names short & unique. Start with a letter and use
only letters, numbers, & underscore. No spaces. Lowercase is OK
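The naming and date rules above can be tested before a sheet is imported. This is a rough Python sketch rather than a feature of Excel or of any statistics package; the function names are hypothetical, and comparing names without regard to case is an assumption about what "unique" should mean.

```python
import re
from datetime import datetime

# Starts with a letter, then only letters, numbers and underscore.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_name(name):
    """True if the variable name follows the naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


def find_name_problems(names):
    """Return (invalid names, names that repeat an earlier one)."""
    invalid = [n for n in names if not is_valid_name(n)]
    seen, duplicates = set(), []
    for n in names:
        key = n.lower()  # assumption: "ms" and "MS" count as the same name
        if key in seen:
            duplicates.append(n)
        seen.add(key)
    return invalid, duplicates


def parse_dmy(text):
    """Parse a DD/MM/YYYY string; return a date, or None if it does not fit."""
    try:
        return datetime.strptime(text, "%d/%m/%Y").date()
    except ValueError:
        return None


print(find_name_problems(["ms", "smknum", "smk num", "2age", "MS"]))
print(parse_dmy("31/12/2020"), parse_dmy("12/31/2020"))
```

A date typed as month/day/year fails the check only when the day is above 12, so ambiguous values such as 03/04/2020 still need to be confirmed against the source form.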
Questionnaire design and entry
– Free program
– Based on EpiInfo
– Windows format
– Creating questionnaire
– Simple analysis
EpiData (II)
• .QES file
- Questionnaire
• .REC file
- Actual data
• .CHK file
- Any defined checks
• Preview questionnaire
-(click Make data file ► preview data form)
Three sections:
• No underspace
• <#>, <##.#>
Text:
• Holds information as text and/or numbers
• No mathematical operations
• <_>
Other variables
• Boolean variables:
- These are logical variables; there are only two possible answers: yes / no
- <Y>
• Date variables:
- Hold information on dates
- <DD/MM/YYYY>
- System variable: today's date (the date of data entry)
- <Today-dmy>
• Soundex:
- Coding of words (anonymous, e.g. A-123)
- code to limit orthographic errors (e.g. Rome and Roma)
- <S >
• Auto identification number (system variable)
- Counts the records entered
- <IDNUM>
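To show how a Soundex code makes Rome and Roma match, here is a small Python sketch of the standard American Soundex rules. EpiData's own implementation may differ in detail; the hyphenated output format is an assumption that follows the A-123 example above.

```python
# Letter groups of the standard American Soundex scheme.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return a code such as 'R-500': first letter plus three digits."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = CODES.get(first, "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")  # vowels have no code
        if code and code != previous:
            digits.append(code)
        previous = code
    # Keep three digits, padding with zeros when the word is short.
    return first + "-" + "".join(digits)[:3].ljust(3, "0")


print(soundex("Rome"), soundex("Roma"))      # R-500 R-500
print(soundex("Robert"), soundex("Rupert"))  # R-163 R-163
```

Because only the first letter and the consonant pattern are kept, the code hides the original word and tolerates small spelling differences.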
• Jumps
– Jumps are used to skip fields that
are not applicable
• Must Enter
– Data must be entered in field
• Repeat
– Show data from previous record
• Value Label
– Click “+” to add label
– Add text to explain label values
– Look at the screen forms as shown
• Advanced
– To prevent duplication in the unique ID
(identification) number, you can do as
• Delete records:
- Click cross to delete
- Record is marked for deletion but can be recovered
• File Structure
• Data entry notes (.NOT file)
- Use to write comments during data entry
- e.g. difficult-to-read handwriting
• View Data
• List Data
• Codebook
- Basic descriptive statistics on all variables
• Validate duplicate files
- Check consistency after double entry
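The idea behind validating double entry can be sketched in a few lines of Python. This is not EpiData's own routine; `compare_entries` and the `id` key field are hypothetical, and each entry is assumed to be a list of dictionaries.

```python
def compare_entries(first, second, key="id"):
    """Compare two entries of the same data and list every disagreement."""
    first_by_id = {r[key]: r for r in first}
    second_by_id = {r[key]: r for r in second}
    report = []
    for rec_id in sorted(first_by_id.keys() | second_by_id.keys()):
        a, b = first_by_id.get(rec_id), second_by_id.get(rec_id)
        if a is None or b is None:
            # The record was entered in only one of the two files.
            where = "first" if a is None else "second"
            report.append((rec_id, None, "missing in " + where + " entry"))
            continue
        for field in sorted(a.keys() | b.keys()):
            if a.get(field) != b.get(field):
                report.append((rec_id, field, (a.get(field), b.get(field))))
    return report


first = [{"id": 1, "age": 39, "ms": 2}, {"id": 2, "age": 51, "ms": 1}]
second = [{"id": 1, "age": 93, "ms": 2}, {"id": 3, "age": 40, "ms": 4}]
for line in compare_entries(first, second):
    print(line)
```

In the sample data the transposition of 39 into 93 is reported for record 1, and records 2 and 3 are flagged as present in only one file. Each mismatch is then resolved against the paper form.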