Guidelines for Collecting Data via Excel Templates
QHS Guidelines for Collecting Data via Excel Templates Page 2 4/26/2016
1.0 Introduction
This document describes how to optimally develop a database in MS Excel. With these guidelines
This document is broken down into three main areas: philosophy of developing a database, general layout, and data
collection considerations. A checklist is included in Appendix A. Appendix B gives a list of other documents you
may find helpful. They are available on the QHS website:
http://www.mcw.edu/Quantitative-Health-Sciences/Resources.htm.
2.0 Philosophy
2.1 Development of framework for database
When developing a database there are several general considerations:
• For all variables “-9” indicates “missing” and “-8” means “not applicable.”
• ID: 4-digit number.
• Sex/gender: M=Male, F=Female
It’s possible to set up the data dictionary to be used for data validation as well. See section 4.3.
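As a minimal sketch of how the coding conventions above might be checked outside Excel, the following Python snippet represents part of a data dictionary and tests one record against it. The dictionary layout and the function name check_record are hypothetical, chosen for illustration; only the codes themselves (-9, -8, the 4-digit ID, and M/F) come from the list above.

```python
# Hypothetical data dictionary built from the conventions listed above.
MISSING = "-9"          # code for "missing"
NOT_APPLICABLE = "-8"   # code for "not applicable"

DATA_DICTIONARY = {
    "ID": lambda v: v.isdigit() and len(v) == 4,   # 4-digit number
    "Sex": lambda v: v in {"M", "F"},              # M=Male, F=Female
}

def check_record(record):
    """Return the names of variables whose values break the dictionary."""
    problems = []
    for name, is_valid in DATA_DICTIONARY.items():
        # A variable absent from the record is treated as missing (assumption).
        value = str(record.get(name, MISSING))
        if value in (MISSING, NOT_APPLICABLE):
            continue  # the special codes are allowed for all variables
        if not is_valid(value):
            problems.append(name)
    return problems
```

For example, check_record({"ID": "42", "Sex": "F"}) would flag "ID", because the value is not four digits long.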
2.4 Monitoring the contents
Reports and checks can improve data quality. In developing a monitoring plan the following should be decided:
It is always worthwhile to consult your statistician and data management expert on these details at the
beginning.
3.0 Setting up a general template
3.1 Merged cells only for non-data collection
Merged cells should only be used for formatting purposes (i.e. within a non-data collection section of the template).
For example, merging the cells within the table header would be fine, but having merged cells in the active area of
the form where data is being collected would not be fine. During the analysis phase any cells that have been merged
that lie within the active data collection area of the form will need to be unmerged, which has the potential for data
loss.
To rename a tab in an Excel file, click on the “Format” drop-down menu located in the main ribbon. Then select
“Sheet”, and then “Rename”. To add a new worksheet, right-click on any of the tab names, click on
“Insert” in the pop-up menu, and then select “Worksheet”. Excel will add another worksheet to the workbook.
3.3 Need to use header rows
All templates must have one or more header rows clearly identifying what data will be stored in each column of the
Excel file. It is acceptable to change the alignment of the text in the cell to conserve space. To change the
alignment, right-click the cell, select “Format Cells”, and then click on the “Alignment” tab of the pop-up form. The
“Orientation” setting will change the angle of the text, while the “Horizontal” and “Vertical” text boxes will change
the positioning of the text in the cell.
Text alignment
For ease of use during data entry, the header rows can be split into their own pane, allowing users to scroll through
the spreadsheet and still match up data cells with the header information. To split a heading row, click the
split button located on the upper right-hand side of the worksheet and drag it to the desired location.
4.0 Data collection form considerations
Using the data dictionary the following practice should be observed.
4.2 Variables
A variable is a characteristic of a unit being observed that may assume more than one of a set of values to which a
numerical measure or a category from a classification can be assigned (e.g. income, age, weight, etc., and
“occupation”, “industry”, “disease”, etc.).
• Each variable should be aligned with a header column in the template and hold only one value. See section
2.2 for how to make sure to do this.
o Any time multiple pieces of information make up what we think of as one piece of information
(e.g. blood pressure), it needs to be separated out into its distinct pieces and stored separately in
different variables.
o An example of a common variable that appears to be one unit of data is date/time. A date/time
should really be stored in two separate variables, one for date and one for time.
See section 4.2.2 for more details on how variables hold data.
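The two cases above (blood pressure and date/time) can be sketched in Python as follows. This is only an illustration of splitting one combined entry into separate variables; the function names, the "120/80" input layout, and the date format string are assumptions, not part of the template.

```python
from datetime import datetime

def split_blood_pressure(reading):
    """Split a reading such as '120/80' into (systolic, diastolic)."""
    systolic, diastolic = reading.split("/")
    return int(systolic), int(diastolic)

def split_date_time(stamp, fmt="%m/%d/%Y %H:%M"):
    """Split a combined date/time string into separate date and time strings."""
    parsed = datetime.strptime(stamp, fmt)
    return parsed.strftime("%m/%d/%Y"), parsed.strftime("%H:%M")
```

Each returned piece would then be stored in its own column, for example one column for systolic and one for diastolic pressure.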
4.2.1 Identifiers
Each individual record needs to have a unique identifier that is specific to just that record.
Identifiable data such as MRN numbers should not be used as the identifier in the final dataset.
To create a basic study identifier in Excel that can be used for each new record that is added, type the following
formula into the cell following the first entered identifier value, and then copy and paste the formula. More
elaborate identifiers can be created; contact QHS for assistance.
[Figure: Excel formula; copy and paste the formula; results of the copy and paste]
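The formula itself is not reproduced in this text. As a hedged sketch, assuming the simplest scheme in which each new identifier is the previous identifier plus one, the result of the copy and paste can be modelled in Python as below. The function name next_study_ids is hypothetical.

```python
def next_study_ids(first_id, count):
    """Return `count` identifiers following `first_id`, each one higher than the last."""
    # Assumption: identifiers simply increase by 1 from the first entered value.
    return [first_id + offset for offset in range(1, count + 1)]
```

Starting from a first identifier of 999, two further records would receive 1000 and 1001.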
4.2.2 Categories
There are two data storage categories for variables and they function differently during the data analysis phase of the
study.
• The Mutually Exclusive category describes variables where there is only one value being collected (e.g.
“Select one of the following” types of questions). Examples:
o Yes or No questions
o True or False questions
o Select one response from multiple values (e.g. 1 = Low, 2 = Medium, 3 = High)
• The Non-mutually Exclusive category describes the “Select all that apply” type of questions.
o For example, a patient’s tumor was detected via any or all of the following: physician exam,
radiographic imaging, or laboratory evaluation. In this case, three variables need to be created with
1=Yes or 0=No to indicate if the given method was used.
Study ID   Q4. Tumor_detected_lab   Q4.Tumor_detected_pe   Q4. Tumor_detected_ri
999        0                        1                      1
1000       1                        1                      0
1001       1                        1                      1
• For questions where “Other” is a choice, whether in a mutually exclusive or non-mutually exclusive format,
the variable should not be an open ended text box. Instead have two variables, one for “Other” and one
where a specific response can be typed in.
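The “Select all that apply” and “Other” rules above can be sketched together in Python. This is a minimal illustration under stated assumptions: the variable names Q4tumor_ri, Q4tumor_lab, Q4tumor_other, and Q4tumor_other_text are hypothetical (only Q4tumor_pe appears in this document), and the function name encode_select_all is invented for the example.

```python
# Hypothetical variable names modelled on the tumor-detection example above.
CHOICES = {
    "physician exam": "Q4tumor_pe",
    "radiographic imaging": "Q4tumor_ri",
    "laboratory evaluation": "Q4tumor_lab",
}

def encode_select_all(selected):
    """Turn a list of selected answers into 1=Yes / 0=No variables."""
    row = {variable: 0 for variable in CHOICES.values()}
    row["Q4tumor_other"] = 0
    row["Q4tumor_other_text"] = ""
    for answer in selected:
        key = answer.strip().lower()
        if key in CHOICES:
            row[CHOICES[key]] = 1
        else:
            # Anything outside the fixed choices is recorded as "Other",
            # with the typed response kept in its own text variable.
            row["Q4tumor_other"] = 1
            row["Q4tumor_other_text"] = answer.strip()
    return row
```

Every choice gets its own 0/1 variable, so a record with several methods selected never needs more than one value in any cell.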
Header                                                 Variable     Valid Value   Note
Q4. How was the Tumor Detected; Choice Physical Exam   Q4tumor_pe   0             No
Q4. How was the Tumor Detected; Choice Physical Exam   Q4tumor_pe   1             Yes
The next step is to define a “Name” for the valid values.
• Select the cell or range of cells in the Valid Value column that apply to the variable in question;
• Click the “Name” box at the left end of the formula bar;
• Type a name for the cells and press enter;
• Specify the location of the list of valid entries by clicking the Source bar and then highlighting the cells
that contain the valid values, or type the name of the valid values list into the Source box;
The Valid Values List is now active in the selected cell. Use Excel’s copy and paste command to update the rest of
the column.
The Restriction is now active in the selected cell. Use Excel’s copy and paste command to update the rest of the
column.
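Once data have been entered, the same idea of a named list of valid values can be used to audit a column outside Excel. The Python sketch below is an illustration only; the list name yes_no and the function name invalid_rows are hypothetical, and the valid values 0 and 1 are taken from the data dictionary example for Q4tumor_pe.

```python
# Hypothetical named lists, mirroring the "Name" defined for each set of valid values.
NAMED_LISTS = {
    "yes_no": {0, 1},   # 0 = No, 1 = Yes
}

def invalid_rows(column, list_name):
    """Return the 1-based positions of entries not found in the named list."""
    valid = NAMED_LISTS[list_name]
    return [i for i, value in enumerate(column, start=1) if value not in valid]
```

For a column holding 0, 1, 2, 1, the call invalid_rows([0, 1, 2, 1], "yes_no") reports position 3, the entry that Excel's validation would have rejected.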
4.6 Linking spreadsheets in a workbook
One common application for linking spreadsheets in a workbook is typing the study ID on the first Excel-based data
collection form and having it auto-populate on any additional forms in the workbook. See the
example below using form Visit and form Enrollment:
• Click on the cell in the other spreadsheet in the workbook that the cell is going to be linked to (form
Enrollment).
• Press enter on the keyboard (back to form Visit).
• Once the initial link is set, the Copy & Paste function can be used to link additional cells.
Excel will go back to the first cell, showing the value of the cell from the other worksheet, as well as a formula in
the formula bar.
[Figure panels]
1. Select the cell on the Visit form that needs to be linked and type “=”.
2. Select the cell on the Enrollment form that should be linked to, then press enter.
3. Excel takes us back to the Visit form; notice the formula in the formula bar.
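The formula that Excel places in the formula bar is a reference of the form sheet name, exclamation mark, cell address. As a small sketch, the Python helper below builds such a formula as text; the function name link_formula and the cell address A2 used in the example are hypothetical. The helper always wraps the sheet name in single quotes, which Excel accepts for any sheet name and requires when the name contains spaces; Excel may display simple names without the quotes.

```python
def link_formula(sheet_name, cell):
    """Build an Excel formula that shows the value of `cell` on another sheet."""
    # An apostrophe inside a quoted sheet name must be doubled.
    escaped = sheet_name.replace("'", "''")
    return f"='{escaped}'!{cell}"
```

For instance, link_formula("Enrollment", "A2") gives ='Enrollment'!A2, which could be entered in the study ID cell of the Visit form.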
APPENDIX A: Excel Document Checklist
APPENDIX B: Resources
Checklists
Documents
Brochures