Econ 656 - Research Methods V - 2023
Seminar in Economics
Part V
Data Processing and Analysis
Content of the Lecture
1. Introduction
2. Data Preparation and Processing
3. Analysis of Data
   1. Univariate analysis
   2. Bivariate analysis
   3. Multivariate analysis
Introduction
Once data is acquired you will need to use it to help you
address your research questions.
For the data to be meaningfully used you need to:
Ensure that the data is complete.
Know your data - becoming familiar with what you have got.
Organize your data.
Analysis is the most rewarding part of your research project.
There is a sense of relief, excitement and satisfaction that
your work is meaningful.
Introduction
Analysis is the process of working with the data to describe, discuss,
interpret, evaluate and explain it in terms of the research
questions or hypotheses.
Introduction
Much of the quantitative data analysis is conducted using
software programs.
So, the collected data must be converted into a machine-
readable, numeric format.
Numerical data can be analyzed quantitatively using statistical
tools in two different ways.
Data Preparation and Processing
Data processing starts with editing, coding, classifying and
tabulating the collected data.
Data Preparation and Processing
Two levels of editing: field and central levels.
Central editing: takes place when all forms have been completed
and returned to the office.
Data editors correct obvious errors, such as an entry in the
wrong place or a value recorded in the wrong units.
Data Preparation and Processing
Checking questionnaires: Identifiers
Each questionnaire or case needs a unique identifier.
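A minimal sketch of an identifier check, assuming the questionnaires have been entered as a list of records. The field name `qid` and the sample values are hypothetical, not taken from any survey.

```python
from collections import Counter

def duplicate_ids(records, id_field="qid"):
    """Return the identifiers that appear more than once, sorted."""
    counts = Counter(rec[id_field] for rec in records)
    return sorted(i for i, n in counts.items() if n > 1)

# Hypothetical entries: questionnaire 102 was keyed in twice.
records = [
    {"qid": 101, "age": 34},
    {"qid": 102, "age": 51},
    {"qid": 102, "age": 51},
    {"qid": 103, "age": 27},
]
print(duplicate_ids(records))  # [102]
```

An empty list means every case can be traced back to exactly one paper questionnaire.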
Data Preparation and Processing
What to do with partial responses – missing responses
If a questionnaire is only partially completed there may
be a number of reasons for this.
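One way to judge how partial a questionnaire is would be to compute the share of unanswered questions per case. A minimal sketch, assuming each case is a dictionary and that blanks were entered as `None` or an empty string; the question names are hypothetical.

```python
def missing_share(record, questions):
    """Fraction of the listed questions left unanswered (None, empty, or absent)."""
    blank = sum(1 for q in questions if record.get(q) in (None, ""))
    return blank / len(questions)

questions = ["q1", "q2", "q3", "q4"]             # hypothetical question names
record = {"q1": 3, "q2": None, "q3": "", "q4": 1}
print(missing_share(record, questions))           # 0.5
```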
Data Preparation and Processing
Inconsistent data
Sometimes you will find that the information given by a
respondent within a questionnaire is inconsistent.
This can be the case with both factual and value data.
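A consistency check on factual data can be sketched as a simple rule applied to every case. The rule below (reported children cannot exceed household size) and the field names are illustrative assumptions only.

```python
def inconsistent_cases(records):
    """Return ids of records where reported children exceed household size."""
    return [r["qid"] for r in records if r["children"] > r["hh_size"]]

# Hypothetical cases; case 2 reports more children than household members.
records = [
    {"qid": 1, "hh_size": 5, "children": 3},
    {"qid": 2, "hh_size": 2, "children": 4},
    {"qid": 3, "hh_size": 6, "children": 0},
]
print(inconsistent_cases(records))  # [2]
```

The check only flags the case; it does not say which of the two answers is wrong.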
Data Preparation and Processing
This type of inconsistency could have occurred for a number
of reasons.
Data Preparation and Processing
As with missing information, you will need to consider
whether:
the data can be checked in some way (by referring to other
questions or by contacting the participant); and
whether the data is usable in your analysis.
Data Preparation and Processing
Coding: Many data collection instruments include open
questions.
i.e., questions that do not have a preset range of
answers.
In order to be able to work with this data using statistical
analysis the data from open questions need to be coded.
Coding refers to the process of assigning numerals to
answers so that responses can be put into a limited number
of categories or classes – coding sheet.
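A coding sheet can be sketched as a lookup table from answers to numerals. The categories and codes below are hypothetical, and the sketch assumes answers differ only in capitalization and stray spaces; real open answers usually need manual classification first.

```python
# Hypothetical coding sheet for "main source of income".
CODEBOOK = {"farming": 1, "trade": 2, "wage work": 3}
OTHER = 9  # catch-all code for answers not on the sheet

def code_response(answer):
    """Map a free-text answer to its numeric code; unlisted answers get OTHER."""
    return CODEBOOK.get(answer.strip().lower(), OTHER)

answers = ["Farming", "trade ", "fishing", "Wage work"]
print([code_response(a) for a in answers])  # [1, 2, 9, 3]
```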
Data Preparation and Processing
Coding is the process of converting data into a numeric format.
This enables you to enter the data quickly using the
numeric keypad on your keyboard and with fewer
errors.
Coding is especially important for large complex studies
involving many variables and measurement items.
Data Preparation and Processing
The coding must be:
Exhaustive - there must be a class for every data item.
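Exhaustiveness can be checked mechanically by listing the data items that fall into no class. A minimal sketch, assuming numeric classes with inclusive bounds; the age bands and values are hypothetical.

```python
def unclassified(values, classes):
    """Return the values that fall into none of the (low, high) classes, bounds inclusive."""
    return [v for v in values if not any(lo <= v <= hi for lo, hi in classes)]

age_classes = [(18, 29), (30, 44), (45, 64)]   # hypothetical age bands
ages = [22, 37, 70, 45, 16]
print(unclassified(ages, age_classes))  # [70, 16]
```

A non-empty result means the scheme is not exhaustive and needs further classes, for example "under 18" and "65 and over".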
Example
You can consider each of the listed foods as a variable and code each
variable as 1 if it is ticked, 2 if it is not ticked.
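The tick-box rule can be sketched as follows. The food list is hypothetical, since the original form is not reproduced here; the codes 1 and 2 follow the rule stated in the example.

```python
FOODS = ["rice", "maize", "beans", "fish"]   # hypothetical list on the form

def code_ticks(ticked):
    """Code each listed food as 1 if ticked, 2 if not ticked."""
    return {food: (1 if food in ticked else 2) for food in FOODS}

print(code_ticks({"maize", "fish"}))
# {'rice': 2, 'maize': 1, 'beans': 2, 'fish': 1}
```

Each food becomes its own variable, so one respondent can tick several foods without the codes clashing.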
Data Preparation and Processing
Data entry: Coded data can be entered into a spreadsheet,
database, text file, or directly into a statistical program like
Stata or SPSS.
Analysis of quantitative data
Analysis is a process of summarizing, describing and
explaining the data in terms of the research questions or
hypothesis.
Analysis of quantitative data: Univariate analysis
With respect to the number of variables, three types of statistical
analysis can be distinguished:
Univariate analysis: only one variable
Bivariate analysis: two variables
Multivariate analysis: more than two variables
Univariate analysis refers to a set of statistical techniques that
can describe the general properties of one variable.
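A frequency table is one of the simplest univariate descriptions. A minimal sketch with hypothetical household sizes:

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, percent) rows sorted by value."""
    counts = Counter(values)
    n = len(values)
    return [(v, c, 100 * c / n) for v, c in sorted(counts.items())]

hh_size = [4, 5, 4, 6, 5, 4, 3, 5]   # hypothetical household sizes
for value, count, pct in frequency_table(hh_size):
    print(value, count, f"{pct:.1f}%")
```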
Analysis of quantitative data: Univariate analysis
The distribution, or ‘shape’, of your data can also be
depicted in the form of graphs and charts.
Bar charts and histograms can help you to visualize the shape
or distribution of the values for each of your variables.
Graphs are effective ways of summarizing your data and
helping you to identify interesting or anomalous features
within the data.
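A histogram is a count of values per interval. The sketch below prints a rough text version, assuming whole-number incomes and a bin width of 200; both the data and the width are hypothetical choices.

```python
def histogram(values, bin_width):
    """Count values per bin [k*w, (k+1)*w) and return {bin_start: count}."""
    bins = {}
    for v in values:
        start = (v // bin_width) * bin_width
        bins[start] = bins.get(start, 0) + 1
    return dict(sorted(bins.items()))

income = [120, 340, 360, 410, 450, 480, 520, 980]   # hypothetical incomes
width = 200
for start, count in histogram(income, width).items():
    print(f"{start:>4}-{start + width - 1:<4} {'#' * count}")
```

The isolated bar at 800-999 is the kind of anomalous feature a chart makes easy to spot.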
Analysis of quantitative data: Univariate analysis
Measures of central tendency: a value typical for the data
The mean, median and mode are methods of
summarizing the data relating to one variable.
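The three measures can be computed with Python's standard `statistics` module. The incomes below are hypothetical and include one very large value to show how the measures differ.

```python
import statistics

income = [200, 250, 250, 300, 1000]   # hypothetical incomes

print(statistics.mean(income))    # 400
print(statistics.median(income))  # 250
print(statistics.mode(income))    # 250
```

The single income of 1000 pulls the mean well above the median, which is why the median is often preferred for skewed variables.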
[Figure: Scatter plot of a positive association. Income (x-axis) and livestock ownership (y-axis).]
[Figure: Scatter plot of a negative association. Income (x-axis) and rate of illiteracy in % (y-axis).]
[Figure: Scatter plot of no association. Income (x-axis) and household size (y-axis).]
Analysis of quantitative data: Bivariate analysis
Correlation analysis: The most common bivariate statistic is the
bivariate correlation, which is a number between -1 and +1.
[Figure: scatter plot divided into quadrants I to IV by the lines at mean x and mean y.]
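The correlation coefficient can be sketched directly from deviations around the two means. Points in quadrants I and III (both variables above, or both below, their means) add positive cross-products, while points in quadrants II and IV add negative ones. The sketch assumes Pearson's r is the statistic meant; the data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from deviations around the means of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))   # cross-products
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

income = [200, 400, 600, 800, 1000]     # hypothetical values
livestock = [10, 20, 30, 40, 50]        # rises exactly in step with income
illiteracy = [80, 65, 50, 35, 20]       # falls exactly in step with income
print(pearson_r(income, livestock))     # 1.0
print(pearson_r(income, illiteracy))    # -1.0
```

Real data rarely lie exactly on a line, so values strictly between -1 and +1 are the norm.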
Analysis of quantitative data: Multivariate analysis
Multivariate analysis: the relationship between three or
more variables
Some of the relationships identified in bivariate analysis can
be spurious - when there is no real relationship
Analysis should control for the effects of additional variables
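A spurious relationship can be illustrated with simulated data in which x and y both depend on a third variable z but not on each other. The sketch uses a first-order partial correlation as one simple way of controlling for z; the simulation set-up, seed and sample size are assumptions made for illustration.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation from deviations around the means of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(x, y, z):
    """Correlation of x and y after controlling for z."""
    rxy, rxz, ryz = pearson_r(x, y), pearson_r(x, z), pearson_r(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

random.seed(0)
z = [random.gauss(0, 1) for _ in range(2000)]   # common cause
x = [v + random.gauss(0, 1) for v in z]         # depends only on z
y = [v + random.gauss(0, 1) for v in z]         # depends only on z

print(round(pearson_r(x, y), 2))      # clearly positive
print(round(partial_r(x, y, z), 2))   # close to zero
```

The bivariate correlation suggests a link between x and y; once z is held constant the link largely disappears.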
Analysis of quantitative data: Multivariate analysis
General Linear Model: Most statistical procedures are derived
from a general family of statistical models called the general
linear model (GLM).
Yi = β0 + β1*X1 + β2*X2 + … + βn*Xn + εi
or, more compactly, Yi = β0 + ∑j βj*Xj + εi (sum over j = 1, …, n)
Analysis of quantitative data: Multivariate analysis
How are the parameters (βi) estimated?
The most widely used method is ordinary least squares (OLS).
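OLS chooses the coefficients that minimize the sum of squared residuals, which amounts to solving the normal equations (X'X)b = X'y. The sketch below does this in plain Python on hypothetical data generated without an error term, so the true coefficients are known. In practice a statistical package such as Stata or SPSS would do this step.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y.

    Each row of X must start with 1 so that the first coefficient is the intercept.
    """
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
X = [[1, a, b] for a, b in zip(x1, x2)]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
print([round(b, 6) for b in ols(X, y)])  # [1.0, 2.0, 3.0]
```

With real data the error term is not zero, so the estimates only approximate the underlying parameters.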
37
Analysis of quantitative data: Multivariate analysis
Various tests can be carried out.
Overall test (F-test): the null hypothesis for the overall test
is that all the slope coefficients of the regression are zero (no
explanatory power).
H0: β1 = β2 = β3 = … = βn = 0
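For a regression with an intercept, the overall F statistic can be computed from R-squared as F = (R²/k) / ((1 - R²)/(N - k - 1)), where k is the number of regressors and N the number of observations. A minimal sketch with hypothetical inputs; the function and argument names are illustrative.

```python
def overall_f(r_squared, n_obs, n_regressors):
    """F statistic for H0: all slope coefficients are zero, from R-squared."""
    k = n_regressors
    return (r_squared / k) / ((1 - r_squared) / (n_obs - k - 1))

# Hypothetical regression: R-squared 0.5, 11 observations, 2 regressors.
print(overall_f(0.5, 11, 2))  # 4.0
```

The statistic is compared with the critical value of the F distribution with k and N - k - 1 degrees of freedom; statistical packages report it together with its p-value.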
38
Analysis of quantitative data: Multivariate analysis
Several econometric problems can also be expected:
Sample Selectivity
Misspecification
Omitted Variables
Fixed Effects
Endogenous Variables