
S15 Data Editing and Coding


18-May-20

Dr. Rohit Vishal Kumar


Associate Professor, IMI Bhubaneswar

Raw Data
-> Data Cleaning: Editing of Data, Coding of Data
-> Data File for Analysis
-> Data Analysis Software / Manual Analysis
-> Analysis Plan: Descriptive Analysis, Univariate Analysis, Bivariate Analysis, Multivariate Analysis


 Raw Data:
 The unedited response from a respondent exactly as indicated by that respondent

 Non-Respondent Error:
 An error that the respondent is not responsible for creating, such as one made by the interviewer

 Data Integrity:
 The notion that the data file actually contains the information which the researcher promised to the decision maker
 Alternatively, the data it contains are a true and accurate representation of the respondent’s views

 Editing of Data:
 The process of checking the completeness, consistency and legibility of data and making the data ready for coding and transfer to storage

 Can be broadly classified into:


 Field Editing
 In-House Editing

 Checking done for


 Transference errors
 Units of measurement errors
 Item non-response (unanswered questions or partially answered questions)


 How long have you lived at the current address? 48

 What is your age? 32 years

 Does your organisation have more than one computer network?
 Yes  No
If “Yes”, how many? 3
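
Checks of this kind can be automated once the answers are keyed in. What follows is a minimal sketch in Python, not a prescribed procedure. The field names (`address_duration`, `address_unit`, `age`, `multi_network`, `network_count`) are hypothetical, and the sample record is only modelled on the answers above: it assumes that no unit was written next to 48 and that neither Yes nor No was ticked.

```python
def edit_checks(record):
    """Return a list of editing problems found in one keyed-in questionnaire."""
    problems = []
    duration = record.get("address_duration")
    unit = record.get("address_unit")
    age = record.get("age")
    filter_answer = record.get("multi_network")
    count = record.get("network_count")

    # Units of measurement: a bare number could mean months or years.
    if duration is not None and unit is None:
        problems.append("address_duration: unit of measurement missing")

    # Consistency: time at the address cannot exceed the respondent's age.
    if duration is not None and age is not None and unit == "years" and duration > age:
        problems.append("address_duration: longer than respondent's age")

    # Item non-response on the Yes/No filter question.
    if filter_answer is None:
        problems.append("multi_network: item non-response")

    # Skip pattern: "how many" should only be answered after a "yes".
    if count is not None and filter_answer != "yes":
        problems.append("network_count: answered without a 'yes' on the filter question")

    return problems


# Hypothetical record modelled on the sample answers (assumptions noted above).
sample = {
    "address_duration": 48,
    "address_unit": None,
    "age": 32,
    "multi_network": None,
    "network_count": 3,
}
issues = edit_checks(sample)
```

The function only flags problems; deciding how to resolve each one remains the editor's job.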

Ethical Issue:

Should we “plug in” or “impute” the data that is missing? Will it lead to better answers, or will it bias the answers and findings?


 Coding is the process of assigning a numerical score or other character symbol to previously edited data
 Codes refer to the numerical symbols assigned to pieces of information

We shall learn about Coding by using a Small Questionnaire

Codes are Transferred into a Data-File

 The objective is to understand how to convert a hard copy of the questionnaire into an electronic file suitable for analysis

 We shall talk about:
 How to translate the Questionnaire into Codes
 How to get these codes into a common base file [Delimited File]
 How to get it into SPSS or some other program for Analysis

 Fully Practical Session


 “We will prepare a sample data-file fit for analysis”
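
As a rough illustration of the "common base file" step, the sketch below writes coded responses to a comma-delimited text using Python's standard `csv` module. The variable names (`resp_id`, `gender`, `age`, `satisfaction`) and the two coded rows are hypothetical and do not come from the session's questionnaire.

```python
import csv
import io

# Hypothetical variable names, one per questionnaire item.
FIELDS = ["resp_id", "gender", "age", "satisfaction"]

# Hypothetical coded questionnaires: one dictionary per respondent.
coded_rows = [
    {"resp_id": 1, "gender": 1, "age": 32, "satisfaction": 2},
    {"resp_id": 2, "gender": 2, "age": 45, "satisfaction": 4},
]


def to_delimited(rows, fields, delimiter=","):
    """Return the coded rows as delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fields, delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


text = to_delimited(coded_rows, FIELDS)
```

Writing `text` to a `.csv` file gives one row per case and one column per variable, a layout that most statistical packages can read.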

Please Fill in the Questionnaire given to you


 Dichotomous
 Only One Response Possible
 The least difficult to deal with

 Multiple Choice
 One or more than one response possible
 Somewhat difficult to deal with

 Open Ended:
 Responses are recorded verbatim
 Most difficult to deal with


 Dichotomous:
 Only one column needed in MS-Excel

 Multiple Choice
 Can be looked upon in 2 different ways
 A series of Dichotomous Questions
 A set of Responses
 SPSS uses MR-Group to handle these kinds of responses
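
The two ways of looking at a multiple-choice question can be sketched as two column layouts. This is an illustrative Python example only: the question name `q5`, its four options and the limit of three answers are all hypothetical.

```python
# Hypothetical multiple-choice question q5 with four coded options.
OPTIONS = {1: "Newspaper", 2: "Television", 3: "Radio", 4: "Internet"}


def as_dichotomies(selected, options=OPTIONS):
    """Layout 1: one 0/1 column per option (a series of dichotomous questions)."""
    return {f"q5_{code}": int(code in selected) for code in options}


def as_response_set(selected, max_answers=3):
    """Layout 2: a fixed number of columns holding the chosen codes."""
    if len(selected) > max_answers:
        raise ValueError("more answers than response columns")
    ordered = sorted(selected)
    padded = ordered + [None] * (max_answers - len(ordered))
    return {f"q5_r{i + 1}": code for i, code in enumerate(padded)}


answer = {2, 4}  # respondent ticked Television and Internet
dichotomies = as_dichotomies(answer)
response_set = as_response_set(answer)
```

The first layout needs as many columns as there are options; the second needs only as many columns as the largest number of answers any respondent may give.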


 Go through the open-ended questions in about 10-20 questionnaires
 Identify key outputs
 Start numbering the key outputs [Known as Post-Code Number]
 If an idea is similar to an identified key output, put it under the same Post-Code Number
 Else create a new Post-Code Number
 Keep track of the Post-Code Numbers
 Decide whether some of the post codes can be merged
 Now write the Post-Code Numbers on the questionnaire and enter the data
 Essentially, reduce the responses to “a set of responses” and then use MR-Group for analysis
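
The bookkeeping part of post-coding can be sketched in a few lines of Python. The `CodeFrame` class and the sample responses are hypothetical. The sketch also assumes the coder has already reduced each verbatim answer to its key outputs, since judging whether two ideas are similar is a human decision that the code does not attempt.

```python
class CodeFrame:
    """Keeps track of which Post-Code Number belongs to which key output."""

    def __init__(self):
        self.codes = {}      # key output (lower case) -> Post-Code Number
        self.next_code = 1   # next unused Post-Code Number

    def code_for(self, idea):
        """Return the existing number for an idea, or issue a new one."""
        key = idea.strip().lower()
        if key not in self.codes:
            self.codes[key] = self.next_code
            self.next_code += 1
        return self.codes[key]

    def merge(self, keep, drop):
        """Fold Post-Code `drop` into Post-Code `keep`."""
        for key, code in self.codes.items():
            if code == drop:
                self.codes[key] = keep


# Hypothetical key outputs taken from three questionnaires.
responses = [["Good price", "Friendly staff"], ["friendly staff"], ["Low cost"]]

frame = CodeFrame()
coded = [[frame.code_for(idea) for idea in r] for r in responses]

# The coder decides that "Low cost" (3) means the same as "Good price" (1).
frame.merge(keep=1, drop=3)
recoded = [[frame.code_for(idea) for idea in r] for r in responses]
```

Each inner list in `recoded` is one respondent's "set of responses", ready to be entered as multiple-response columns.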


 SPSS considers a full row as a “Single Case”
 SPSS considers a full column as a “Single Variable”
 A Variable can be of the Following Type:
 Nominal
 Ordinal
 Scale

 Data are considered to be of type “scale” if measured on an interval or ratio scale
 A Variable Name can be up to 8 characters long in older versions of SPSS; later versions accept up to 64 bytes
 It can have a long descriptive name called a label

[Figure: SPSS Data View screenshot, with callouts for Case, Variable and Current Cursor Position]


 Click on the “Variable View” Tab
 Set the desired properties in the tab
 The Key Properties are:
 Name: The variable name (at most 8 characters in older versions of SPSS)
 Type: The type of data: 8 Different Types
 Width: The total number of characters or digits
 Decimals: The number of decimal places to be shown
 Label: The Long Descriptive Name [OPTIONAL]
 Values: Value Names of the Variable [NOMINAL & ORDINAL]
 Missing: Value used to represent a Missing Value
 Columns: The display width of the column in Data View
 Align: Alignment [LEFT, CENTER, RIGHT]
 Measure: The Scaling Type [NOMINAL, ORDINAL, SCALE]


 Nominal or Ordinal Data can have different levels
 For Example:

Value Names    Variable Values
Excellent      1
Very Good      2
Good           3
Average        4
Bad            5

 Take help of Data -> Define Variable Properties or use the questionnaire
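
Outside SPSS, the same pairing of value names and variable values can be held in a simple lookup table. The sketch below uses the five levels from the example; the helper names `encode` and `decode` are hypothetical.

```python
# Variable values and their value names, as in the example above.
VALUE_LABELS = {1: "Excellent", 2: "Very Good", 3: "Good", 4: "Average", 5: "Bad"}

# Reverse lookup used at data entry time (case-insensitive).
LABEL_TO_VALUE = {label.lower(): value for value, label in VALUE_LABELS.items()}


def encode(label):
    """Turn a value name written on the questionnaire into its code."""
    return LABEL_TO_VALUE[label.strip().lower()]


def decode(value):
    """Turn a stored code back into its value name for reporting."""
    return VALUE_LABELS[value]


entered = [encode(answer) for answer in ["Very Good", "bad", " Excellent "]]
```

An answer that is not one of the five value names raises a `KeyError`, which is a cheap way to catch keying errors at entry time.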


 NEVER WORK ON THE ORIGINAL DATA FILE

 Keep the original data-file safe
 Email it to yourself
 Keep it on a CD / Flash Drive and put it in the locker
 Use Dropbox or Google Drive

 Never truncate your data-file
 Statistical software cannot read truncated data files

 Deal with missing values
 Missing Completely at Random (MCAR): Can be completely ignored
 Missing at Random (MAR): May be ignored
 Not Missing at Random (NMAR): Cannot be ignored
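
Before deciding which of the three cases applies, it helps to see how much is missing and where. The following Python sketch is only a first look: the missing-value code `-9`, the variable names and the four rows are hypothetical, and the code cannot tell MCAR, MAR and NMAR apart. That judgement still rests with the researcher.

```python
MISSING_CODE = -9  # hypothetical code keyed in for "no answer"


def recode_missing(rows, missing_code=MISSING_CODE):
    """Replace the keyed-in missing code with None."""
    return [
        {name: (None if value == missing_code else value) for name, value in row.items()}
        for row in rows
    ]


def missing_share(rows):
    """Share of cases with a missing value, per variable."""
    return {
        name: sum(row[name] is None for row in rows) / len(rows)
        for name in rows[0]
    }


def complete_cases(rows):
    """Drop every case with at least one missing value (ignoring missing data)."""
    return [row for row in rows if all(value is not None for value in row.values())]


# Hypothetical working copy of the data file.
raw = [
    {"age": 32, "income": 4},
    {"age": -9, "income": 2},
    {"age": 41, "income": -9},
    {"age": 27, "income": 3},
]
clean = recode_missing(raw)
shares = missing_share(clean)
kept = complete_cases(clean)
```

Dropping incomplete cases, as `complete_cases` does, is the "ignore" option; it is only defensible when the missing data can in fact be ignored.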
