Applied Econometrics
Chapter 2
Software Packages for Econometric Analysis
Outline
1. Introduction
2. Econometric software packages - SPSS & STATA
3. Data Processing & Analysis
✓ Data Processing & Management
Dr Sisay Debebe 1
11/1/2024
Getting Started in Data Analysis using SPSS
Introduction
Here is a brief description of SPSS:
• SPSS is one of the most popular statistical packages. It can perform highly complex data manipulation & analysis with simple instructions.
Getting Started in Data Analysis using STATA
Introduction
Here is a brief description of STATA:
• STATA is a multi-purpose statistical package that helps you explore, summarize & analyze datasets.
Both SPSS & STATA are used for data analysis in research. They can be used for:
◼ Processing questionnaires
◼ Reporting in tables & graphs
◼ Analyzing various statistical reports:
❖ Measures of central tendency
❖ Measures of dispersion & variation
❖ Inferential statistics: chi-square, t-test, F-test
❖ Simple correlation & cross tabulation
❖ Econometric models such as regression, panel & time series models, & much more
To install SPSS: open the SPSS folder, then the crack folder & the readme; copy the “License Authorization Wizard code”; go back to the setup folder & run setup; click next, next; accept the terms in the license agreement; click next, next, next, next, install, ok, next; paste the code; click next, next, finish.
To install STATA: copy the STATA-15 folder & paste it into My Documents, then open the folder & click stataSE. You may create a shortcut on the desktop.
#. How to use variable view & data view for data definition?
• The Data View window
◼ This sheet is visible when you first open the Data Editor; it contains the data
◼ Rows are cases
◼ Columns are variables
• Click on the tab labeled Variable View
◼ This sheet contains information about the dataset that is stored with the dataset
• Name
• Click on the ‘Name’ box.
• Provide a short-hand form of the variable to be entered.
• Spaces are not allowed.
• Give each variable a valid name (8 characters or fewer, with no spaces or punctuation, beginning with a letter, not a digit).
• Use short, easy-to-remember names.
• Each variable name must be unique; duplication is not allowed.
• Variable names are not case sensitive: the names NEWVAR, NewVar, and newvar are all considered identical.
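The naming rules above can be checked mechanically before variables are defined. The following is a minimal Python sketch, not part of SPSS itself: the function names and the candidate list are hypothetical, and allowing digits after the first letter is an assumption of this sketch.

```python
import re

# Rules taken from the slide: at most 8 characters, starts with a letter,
# no spaces or punctuation. Allowing digits after the first letter is an
# assumption of this sketch.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9]{0,7}")


def is_valid_name(name):
    """Return True if a single name satisfies the naming rules above."""
    return NAME_PATTERN.fullmatch(name) is not None


def find_duplicates(names):
    """Return names that clash with an earlier name, ignoring case."""
    seen = set()
    clashes = []
    for name in names:
        key = name.lower()  # NEWVAR, NewVar and newvar count as identical
        if key in seen:
            clashes.append(name)
        seen.add(key)
    return clashes


# Hypothetical candidate names, invented for illustration.
candidates = ["age", "hh income", "2ndjob", "NewVar", "newvar", "householdsize"]
invalid = [n for n in candidates if not is_valid_name(n)]
duplicates = find_duplicates(candidates)
print("invalid:", invalid)
print("duplicates:", duplicates)
```

Here "hh income" fails because of the space, "2ndjob" because it starts with a digit, and "householdsize" because it is longer than 8 characters; "newvar" is reported as a duplicate of "NewVar".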
• Type
• Click on the ‘Type’ box. The two basic types of variables that you will use are numeric & string.
• This column enables you to specify the type of variable.
• Width
– Width allows you to determine the number of characters SPSS will allow to be entered for the variable.
• Decimals
– Number of decimals
– It has to be less than or equal to 5
3.14159265
• Label
– You can specify the details of the variable
– You can write characters with spaces, up to 256 characters
• Measure is the type of measurement scale
Measurement is the process of assigning numbers to objects in such a way that specific properties of the objects are truthfully represented by specific properties of the numbers.
Types of scales
a) Nominal
b) Ordinal
c) Scale
▪ Interval
▪ Ratio
It is an important factor to consider before data entry into SPSS.
⚫ Catholic
⚫ Others
Ratio scale measurement is like interval scale measurement, but it has an absolute zero point, which means that all mathematical operations are possible.
Ratio data are measured on a continuous scale & do have a zero point.
Example:
▪ Age
Interval scale data are measured on a continuous scale & have no true zero point.
Example:
▪ Time moves along a continuous measure (minute, second & so on) without a zero point.
The default is scale, which refers to an interval or ratio level of measurement.
Choose nominal for categorical data & ordinal if your data involve ranking or ordered values.
1. Choose the proper scale for the following variables:
a. Religion
b. Weight
c. Height
d. Income of the household
e. Marital status
f. Postal code
g. Celsius
h. Kelvin
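One practical reason the measurement level matters is that it limits which summary statistics are meaningful. The Python sketch below illustrates a common textbook convention (mode for nominal, median for ordinal, mean for scale). The variables and values are hypothetical, and the function is not an SPSS or STATA feature.

```python
import statistics

# Hypothetical responses, invented for illustration only.
region = ["north", "south", "south", "east", "south"]  # nominal
satisfaction = [1, 3, 2, 3, 5]                          # ordinal codes, 1 = lowest rank
age = [23, 35, 41, 29, 52]                              # scale (ratio)


def summarize(values, level):
    """Return a summary statistic that is meaningful for the given level."""
    if level == "nominal":
        return statistics.mode(values)    # categories can only be counted
    if level == "ordinal":
        return statistics.median(values)  # order is meaningful, distances are not
    if level == "scale":
        return statistics.mean(values)    # arithmetic on the values is meaningful
    raise ValueError(f"unknown level: {level}")


print(summarize(region, "nominal"))
print(summarize(satisfaction, "ordinal"))
print(summarize(age, "scale"))
```

Averaging the numeric codes of a nominal variable would run without error, but the result has no interpretation. This is why the Measure column should be set deliberately instead of left at the default.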
Defining Variables: Data coding & coding the questionnaire
Translate the questionnaire into codes & enter them in SPSS.
The questions in the questionnaire are mapped into variables in SPSS.
Important factors to consider before data entry into SPSS:
◼ Question response format
◼ Scale characteristics
◼ Level of measurement
Question response format:
❖ Open-ended questions (in words or numbers)
❖ Close-ended questions
✓ Dichotomous & categorical questions
✓ Multiple-choice questions
❖ Scaled-response questions
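Coding a questionnaire amounts to keeping a codebook that maps each written answer to a number. A minimal Python sketch of the idea follows. The questions, variable names and code values are hypothetical, and using None for uncodable answers is an assumption of this sketch.

```python
# Hypothetical codebook: the questions, variable names and codes are invented.
CODEBOOK = {
    "sex": {"male": 1, "female": 2},  # dichotomous question
    "marital": {"single": 1, "married": 2, "divorced": 3, "widowed": 4},  # categorical
}
MISSING = None  # marker used here for blank or unrecognised answers


def code_response(variable, answer):
    """Translate one written answer into its numeric code."""
    cleaned = answer.strip().lower()
    return CODEBOOK[variable].get(cleaned, MISSING)


def code_questionnaire(raw):
    """Code every close-ended answer of one respondent."""
    return {var: code_response(var, ans) for var, ans in raw.items()}


respondent = {"sex": " Female", "marital": "Married"}
coded = code_questionnaire(respondent)
print(coded)
```

The same codes would then be entered as value labels for the corresponding variables, so that the numbers in the data sheet remain interpretable.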
Use Exercise-II (Word format) & define the questionnaire:
In SPSS, open Variable View & define the questionnaire.
In STATA, open the Data Editor & define the questionnaire.
There are 2 ways to enter data into SPSS:
1. Enter data directly into SPSS by typing in Data View
2. Enter data into other software such as Excel, then import it into SPSS
Let’s start with the 1st option: use Data View & insert data in SPSS.
In STATA, use the Data Editor & insert data.
▪ Entering Variable Information in the "Variable View"
In SPSS, open Data View & insert data for 15 interviewees at random, just to practice how to enter data.
In SPSS, go to:
File, Open, Data
Select the type of file (for example, Excel) you want to open
Select the name of the file you want to open
Data processing & management contributes to improving data quality.
Data processing may start while the survey work is in progress or immediately after field data collection is completed.
Data processing should start as soon as possible after data definition.
In processing data, the researcher carries out several activities. The main ones are the following:
I. Data editing
II. Data entry
III. Data analysis
Data editing includes:
◼ data cleaning,
◼ coding &
◼ data verification.
Data entry: this is digitizing the data into software such as SPSS, Excel, STATA etc., using a computer.
It is critical that the highest level of accuracy be maintained when inputting data.
One way of ensuring the accuracy of data entry is through double entry.
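Double entry means the same questionnaires are keyed in twice, independently, and the two files are then compared cell by cell. The Python sketch below shows one way to do the comparison. The data layout (respondent id mapped to a dictionary of variables) and all values are hypothetical.

```python
def compare_entries(first, second):
    """List cells where two independent entries of the same forms disagree.

    Each entry maps respondent id -> {variable: value}.
    Returns tuples of (respondent_id, variable, first_value, second_value).
    """
    mismatches = []
    for rid in sorted(set(first) | set(second)):
        row_a = first.get(rid, {})
        row_b = second.get(rid, {})
        for var in sorted(set(row_a) | set(row_b)):
            a, b = row_a.get(var), row_b.get(var)
            if a != b:  # also catches a cell present in only one entry
                mismatches.append((rid, var, a, b))
    return mismatches


# Hypothetical data keyed in by two different operators.
entry_a = {1: {"age": 34, "income": 5200}, 2: {"age": 27, "income": 4100}}
entry_b = {1: {"age": 34, "income": 5200}, 2: {"age": 72, "income": 4100}}
mismatches = compare_entries(entry_a, entry_b)
print(mismatches)
```

Each reported mismatch (here the transposed age for respondent 2) is resolved by going back to the paper questionnaire, not by guessing which entry is right.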
B. Data Problems & Way out
#. What are the main data problems?
#. How can we detect data problems?
#. How to deal with data problems?
Data can be analyzed in a variety of ways. However, before data are analyzed they should be properly inspected, because data analysis with problematic data leads to biased results & misleading inferences.
⚫ A first look at the data
⚫ Types of data problems
⚫ Dealing with data problems
What are the main data problems?
Despite great care in research project design, empirical researchers may face various data problems.
1) Missing data
2) Distributional outliers
3) Data inconsistency
Example:
⚫ Daughter’s age > mother’s age
⚫ Employee’s salary < minimum wage
⚫ Negative value for a distance variable
How Can We Detect Data Problems?
Data problems can be detected at various stages of data processing & analysis:
⚫ Data cleaning/coding
⚫ Data entry (out-of-range values, missing values)
⚫ Data analysis (descriptive/analytical)
Some data problems are easy to detect while others are difficult. For instance, detecting missing data is straightforward, while identifying data inconsistency is very difficult [it requires logical thinking].
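Once the logical rules are written down, the inconsistency examples above can be turned into automatic checks. The Python sketch below flags missing values and the three inconsistencies listed. The field names, the records and the minimum-wage threshold are hypothetical.

```python
MINIMUM_WAGE = 1000  # hypothetical threshold, not taken from the source


def check_record(rec):
    """Return a list of problems found in one respondent's record."""
    problems = []
    # Missing data: any field left empty (None in this sketch).
    for field, value in rec.items():
        if value is None:
            problems.append(f"missing: {field}")
    # Inconsistencies: each rule runs only when its inputs are present.
    daughter, mother = rec.get("daughter_age"), rec.get("mother_age")
    if daughter is not None and mother is not None and daughter > mother:
        problems.append("inconsistent: daughter older than mother")
    salary = rec.get("salary")
    if salary is not None and salary < MINIMUM_WAGE:
        problems.append("inconsistent: salary below minimum wage")
    distance = rec.get("distance_km")
    if distance is not None and distance < 0:
        problems.append("inconsistent: negative distance")
    return problems


# Hypothetical records, invented for illustration.
records = [
    {"id": 1, "daughter_age": 12, "mother_age": 38, "salary": 2500, "distance_km": 4.5},
    {"id": 2, "daughter_age": 40, "mother_age": 35, "salary": None, "distance_km": -3.0},
]
report = {rec["id"]: check_record(rec) for rec in records}
print(report)
```

The missing-value check needs no subject knowledge, while each inconsistency rule had to be thought up by someone who understands the variables. That is the sense in which inconsistency is harder to detect.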
Dealing with Distributional outlier & its way out?
An observation which, compared to other observations, has either too high or too low values for one or more variables is regarded as an outlier or outlying observation.
Before trying to find a solution, it is essential to examine the main cause of an outlier observation.
In principle, an outlier may be a genuine observation or simply the result of a measurement error.
The ultimate solution for dealing with an outlier observation is either to drop the observation corresponding to the outlier data point or to drop the variable containing the outlier from the analysis.
Researchers use alternative strategies to keep observations with outlier data points:
✓ Transforming outlier information reduces the impact of an outlier on sample estimates.
✓ Try a regression with & without transforming the outlier data.
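As a rough illustration of how a transformation reduces an outlier's impact, the Python sketch below flags outliers with the 1.5 × IQR rule and then compares the shift in the mean with the shift in the log-scale (geometric) mean. The IQR rule and the log transformation are common choices assumed here, not methods prescribed by the slides, and the income figures are hypothetical.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed detection rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical monthly incomes with one extreme value.
incomes = [1200, 1500, 1350, 1600, 1450, 1550, 1400, 25000]
flagged = iqr_outliers(incomes)
clean = [v for v in incomes if v not in flagged]

# How much does the outlier inflate each estimate (ratio: with / without)?
mean_shift = statistics.mean(incomes) / statistics.mean(clean)
# The geometric mean is the mean of log(income), transformed back.
geo_shift = statistics.geometric_mean(incomes) / statistics.geometric_mean(clean)

print("flagged:", flagged)
print("arithmetic mean inflated by a factor of", round(mean_shift, 2))
print("log-scale mean inflated by a factor of", round(geo_shift, 2))
```

The log-scale estimate moves much less than the arithmetic mean when the extreme value is included. The same with-and-without comparison can be applied to regression coefficients.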