Lecture 1-Course Overview and Introduction of Econometrics
Tingting Wu
• About me
• Ph.D in Economics
Universitat Autònoma de Barcelona and Barcelona School of Economics
Research Fields:
Behavioural and Experimental Economics, Applied Microeconomics, Public Economics
As an Economist, I’m interested in
• Behavioural Economics
=> Economics + Psychology + Econometrics
• Relaxes the rationality assumptions used in traditional models
• Accepts bounded rationality due to limited information, limited cognitive capability, emotions,
and cultural and social factors.
• Studies how agents deviate from the equilibrium and identifies the causes of such deviations with the
use of observational data
• Experimental Methodologies
• Test hypotheses by observing human subjects’ decision-making processes in controlled experiments.
Class
◆Class participation:
100% face-to-face class with no hybrid/virtual arrangements
No recordings for later reviewing
◆Core references:
Introductory Econometrics: A Modern Approach, 7th edition, South-Western. Wooldridge J.M. (2019)
(Older editions are also fine; the 5th edition is on Canvas)
Using R for Introductory Econometrics, 2nd edition, Florian Heiss (2016) (E-book link)
◆Software:
All empirical exercises are done in R: R and RStudio
Assessment
1. Quiz: 30%
2. Group Project: 30%
3. Final Test: 40%
“Experience has shown that each of these three view-points, that of statistics, economic theory,
and mathematics, is a necessary, but not by itself a sufficient, condition for a real
understanding of the quantitative relations in modern economic life. It is the unification
of all three that is powerful. And it is this unification that constitutes econometrics.”
- Ragnar Frisch (1933)
Why Study Econometrics
Can you tell why these comments are wrong?
“A low relation in SAT score and GPA in college is observed. So SAT score is not a good measure to predict the academic achievement in college.”
- the office of XX academic affairs
“A .3 hitter made three poor shows of batting, so he is gonna make a hit.”
- Baseball game commentary
• Econometrics
Ø helps you quantify “how much” and identify the causal relationships suggested by
economic theory, using data (empirical evidence).
Ø helps you develop “intuition” and “sensitivity” about how things work through data.
X: Factor, Regressor, Explanatory Variable, Independent Variable
Y: Target Variable, Dependent Variable
Hypothesis: Relationship between X and Y
[Figure: box plot with the min, Q1, and median (Q2) marked]
Interquartile Range = 3rd quartile - 1st quartile
• Summarize the data with the min, the 1st, 2nd, 3rd quartile, and the max
• Sometimes the 10th and 90th percentiles are used instead of the min and max
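The summary described above can be sketched in a few lines. This is a minimal illustration in Python (the course itself works in R), using made-up numbers and the "inclusive" quartile convention from the standard library; other conventions give slightly different quartiles.

```python
import statistics

# Hypothetical sample (made-up numbers, for illustration only)
data = [12, 15, 17, 19, 22, 24, 28, 31, 35, 60]

# Quartile cut points Q1, Q2 (median), Q3.
# method="inclusive" is one of several quartile conventions.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = {
    "min": min(data),
    "Q1": q1,
    "median": q2,
    "Q3": q3,
    "max": max(data),
}

# Interquartile range = 3rd quartile - 1st quartile
iqr = q3 - q1

# 10th and 90th percentiles, sometimes reported instead of min and max
deciles = statistics.quantiles(data, n=10, method="inclusive")
p10, p90 = deciles[0], deciles[-1]

print(five_number_summary, iqr, p10, p90)
```

Here the maximum (60) sits far from the rest of the data, which is one reason the 90th percentile is sometimes preferred to the max.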
How would John analyze it?
• Qualitative variable: not a number by itself, but becomes a random variable once it is coded
with a number
e.g.) marital status (single, married, widowed, divorced, separated)
e.g.) employment status (full time, part time, temporary)
e.g.) dwelling (HDB 1b, HDB 2b, HDB 3b and more, Condo, Landed Property)
=> most demographic/socio-economic variables and other categorical variables.
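A minimal sketch of what "coding with a number" can look like, using the employment status example. The responses are made up, and the two coding schemes shown (integer codes and 0/1 dummy variables with a baseline category) are common choices assumed here for illustration.

```python
# Hypothetical survey responses (made-up, for illustration only)
employment = ["full time", "part time", "temporary", "full time", "part time"]

# Option 1: integer codes. The numbers are labels only; their order and
# spacing carry no meaning for a nominal variable.
categories = sorted(set(employment))
code_of = {cat: i for i, cat in enumerate(categories)}
coded = [code_of[e] for e in employment]

# Option 2: dummy (0/1) variables, one per category except a baseline.
baseline = categories[0]  # "full time" is the omitted reference category
dummies = {
    cat: [1 if e == cat else 0 for e in employment]
    for cat in categories
    if cat != baseline
}

print(coded)
print(dummies)
```

Dummy variables are usually the safer choice in a regression, because integer codes would impose an ordering that a nominal variable does not have.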
What can Econometrics do?
• Econometrics fills the gap between being a “student of economics” and being a “practicing economist”.
Steps of Econometric Analysis
Note that there is a large variety of econometric models, and model choice depends very
much on the research question, the underlying economic theory, the availability of data, and the
structure of the problem.
Reading
Two paragraphs from “The Myth of Excess Enrollments in College” (Becker)
The print media and the blogosphere have many discussions of the high and rising cost of attending
college, the debt burden weighing on college students, the fact that the average real earnings of college
graduates have risen only slowly during the past 40 years, and the difficulty young college students are
having in getting good jobs. The conclusion frequently reached is that too many high school graduates
seek a college degree, and that these graduates have been sold a bill of goods about the value of a
college education.
The facts cited are generally correct, but the conclusion about the low value of going to college is
completely wrong. The fallacy stems from believing that earnings from a college education determine
the benefit of a college education. The truth is that the benefit is determined by the earnings from a
college education relative to what earnings would be if a person stopped her education after high
school. A first approximation to this gain in earnings for the typical person is given by the difference
between the average earnings of college and high school graduates.
Questions:
• Find arguments/statements that could be verified or supported with an econometric approach.
• Mark the independent variables and the dependent variable in these statements.
• Pick one statement. What dataset is needed to test it?
• Are any qualitative variables needed?
• Could other factors, e.g., emotions or social norms, help explain the statements?
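The "first approximation" in the reading, the difference between the average earnings of college and high school graduates, is a simple difference in group means. A minimal sketch with made-up earnings figures (not real data):

```python
from statistics import mean

# Made-up annual earnings by education level, for illustration only
earnings = [
    ("college", 62000), ("college", 75000), ("college", 58000),
    ("high school", 41000), ("high school", 39000), ("high school", 46000),
]

def mean_earnings(level):
    """Average earnings of everyone with the given education level."""
    return mean(e for lvl, e in earnings if lvl == level)

# First approximation to the earnings gain from college:
# average college earnings minus average high school earnings.
gain = mean_earnings("college") - mean_earnings("high school")
print(gain)
```

Note that education level enters here as a qualitative variable, and that a raw difference in means does not control for other differences between the two groups.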
Economic model
An economic model
Ø has to reduce the complexity of reality such that it is useful for answering the question of
interest;
Ø is a collection of cleverly chosen assumptions from which implications can be inferred
(logically);
Ø should be as simple as possible and as complex as necessary;
Ø cannot be refuted or “validated” without empirical data of some kind.
In John's example, we assume that the money agent i earns is a function of how
cocky this agent is, so we have:
$$\text{money}_i = f(\text{hormone}_i)$$
where
$\text{money}_i$ = the money that agent i earns
$\text{hormone}_i$ = the hormone level of agent i
Econometric model
Once we have specified an economic model, we need to turn it into an econometric model for John.
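One common way to make this step concrete, assumed here purely for illustration, is a linear form with an unobserved error term: earnings_i = beta0 + beta1 * hormone_i + u_i. The sketch below simulates data from such a model with arbitrary parameter values and recovers them by ordinary least squares (OLS). All names and numbers are hypothetical.

```python
import random
from statistics import mean

random.seed(0)

# Assumed "true" parameters, chosen arbitrarily for the simulation
beta0_true, beta1_true = 10.0, 2.0

n = 500
hormone = [random.uniform(0, 10) for _ in range(n)]   # explanatory variable
u = [random.gauss(0, 1) for _ in range(n)]            # unobserved error term
money = [beta0_true + beta1_true * h + e for h, e in zip(hormone, u)]

# OLS estimates for a single regressor:
# slope = sample covariance(x, y) / sample variance(x)
h_bar, m_bar = mean(hormone), mean(money)
sxy = sum((h - h_bar) * (m - m_bar) for h, m in zip(hormone, money))
sxx = sum((h - h_bar) ** 2 for h in hormone)
beta1_hat = sxy / sxx
beta0_hat = m_bar - beta1_hat * h_bar

print(beta0_hat, beta1_hat)
```

The error term is what separates the econometric model from the economic one: it collects everything that affects earnings but is not in the model.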
Population: Parameter
Sample: Statistic
Inference: a process to find the characteristics of the population
using information from the sample
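A minimal simulation of this idea: a sample statistic is used to learn about a population parameter. The population below is simulated with made-up numbers; in practice the population is not observed, which is why inference is needed.

```python
import random
from statistics import mean

random.seed(1)

# Simulated population of incomes (made-up distribution, for illustration)
population = [random.gauss(50_000, 12_000) for _ in range(100_000)]
parameter = mean(population)             # population mean: the parameter

# Simple random sample without replacement
sample = random.sample(population, 400)
statistic = mean(sample)                 # sample mean: the statistic

# How far the estimate is from the quantity it estimates
estimation_error = statistic - parameter
print(parameter, statistic, estimation_error)
```

Drawing a different random sample would give a different statistic, while the parameter stays fixed.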
Representativeness of the Sample Data
◆Importance of the stochastic process of sampling
Question.
After hearing that your uncle earned a big return from his stock investment, you asked your father
if he had invested anything in the stock market. He said, “Unfortunately, no.”
Disappointed, you wanted to estimate how much he could have earned by now if he had invested
just some of his money in the stock market 20 years ago.
How will you estimate the potential rate of return from the past 20 years of investment?
• What data to collect?
• How to analyze?
Representativeness of the Sample Data
To estimate the potential rate of return from the past 20 years of investment, you did the following,
step by step:
1. You randomly selected 50 stocks.
2. You looked up their prices from 20 years ago.
3. You then calculated each stock's rate of return by comparing that price with its current price.
4. Finally, you calculated the weighted average rate of return of these 50 randomly chosen stocks.
Will this estimate accurately represent the actual rate of return from the past 20 years of investment?
Or will it over- or underestimate the actual rate of return?
What do you think and why?
Representativeness of the Sample Data
Was there any issue in this sampling process? Explain.
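One issue often raised for this kind of procedure is survivorship bias: if the 50 stocks are drawn from those listed today, stocks that were delisted during the 20 years can never be selected. The simulation below is only a sketch under stated assumptions: the lognormal return distribution and the delisting rule are invented for illustration and are not calibrated to any real market.

```python
import random
from statistics import mean

random.seed(42)

n_stocks = 2000
# Assumption: 20-year gross return multiple of each stock that existed
# 20 years ago (ending price / starting price), drawn from a lognormal.
multiples = [random.lognormvariate(0.5, 1.0) for _ in range(n_stocks)]

# Assumption: a stock that lost more than 80% of its value was delisted,
# so it does not appear in today's list of stocks.
survived = [m for m in multiples if m > 0.2]

# Average return over all stocks that could have been bought 20 years ago
all_stocks_return = mean(multiples) - 1

# Average return over the stocks still listed today
survivors_return = mean(survived) - 1

# The procedure in the text: 50 stocks drawn at random from today's list.
# A single draw of 50 is noisy, so it may land above or below either average.
sample_today = random.sample(survived, 50)
sample_return = mean(sample_today) - 1

print(all_stocks_return, survivors_return, sample_return)
```

Because the delisted stocks are exactly the worst performers, the average over survivors is higher than the average over all stocks, so sampling from today's list tends to overestimate the return.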
Experimental data
• Captured through observation of human behavior in a designed experiment.
• Collected through active intervention by the researcher to produce and measure change, or to create a difference
when a variable is altered.
• Often reproducible or replicable, but may be expensive to reproduce.
• Allows researchers to determine a causal relationship, and the result is typically projectable to a larger population.
Common Types of Research data
Simulation data
• Generated by imitating the operation of a real-world process or system over time using computer test models.
• Used to try to determine what would, or could, happen under certain conditions.
• The test model is often as important as, or even more important than, the data generated from the simulation.
• For example, predicting weather conditions, economic models, chemical reactions. (Estimated data)
Derived/compiled data
• Involves using existing data points, often from different data sources, to create new data through some sort of
transformation, such as an arithmetic formula or aggregation.
• This type of data is usually reproducible or replicable once the raw data exists.
• For example, combining area and population data from Singapore to create population density data.
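The population density example amounts to combining two sources with an arithmetic formula. A minimal sketch with hypothetical regions and made-up figures (not actual Singapore data):

```python
# Two separate raw data sources (made-up figures, for illustration only)
area_km2 = {"Region A": 50.0, "Region B": 120.0}
population = {"Region A": 400_000, "Region B": 300_000}

# Derived data: population density = population / area,
# computed only for regions present in both sources.
density = {
    region: population[region] / area_km2[region]
    for region in area_km2.keys() & population.keys()
}

print(density)  # people per square kilometre
```

As long as the two raw sources exist, anyone can recompute the derived variable, which is why this type of data is usually reproducible.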
Various Types of Economic Data Sets
Individuals: person, household, firm, or other economic agent
Pooled cross sections data
• Cross-sectional data allowing for different individuals from different times.
Panel data
• Cross-sectional + time series
• A set of individuals surveyed repeatedly over time.
Types of Data: Cross-sectional data
◆Cross-sectional data refer to observations of many different individuals (subjects, objects) at a given
time. Each observation belongs to a different individual.
◆Examples:
Ø A. Gross annual income level of 10 randomly chosen households in Singapore in the year 2022.
Ø B. Cross-sectional data you see in daily life: Oil prices across countries.
Types of Data: Time series data
◆Time series data is a sequence of data points indexed in time order.
◆Time series data is used to study how past events influence future events and lags in behavior.
◆Examples:
Ø A. Gross annual income level of one randomly chosen household in Singapore from 2001 to 2022.
Ø B. Time series data you may see in newspapers: Crude oil prices over time.
Types of Data: Panel data
◆Panel data is a dataset in which the behavior of entities (e.g., individuals, countries, companies) is
observed across time. Panel data is also known as longitudinal or cross-sectional time-series data.
◆Examples:
Ø A. Gross annual income level of 2 randomly chosen households in Singapore from 2001 to 2022.
Ø B. Panel dataset that you may see in newspapers: Oil production by country over time.
Types of Data: Pooled Cross-Section Data
◆Pooled cross-section data is a collection of cross-sectional data sets in which the units may change
across time.
◆A pooled cross section is usually analyzed much like a standard cross section, except that we need
to account for secular differences in the variables (not in specific individuals) across time.
◆Examples: Gross annual income level of 9 randomly chosen households in Singapore from the year 2019
and one randomly chosen new household in Singapore from the year 2020.
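The four data structures can be told apart by which units are observed in which periods. The sketch below uses made-up household records stored as (household_id, year, income) tuples; the helper name `is_balanced_panel` is hypothetical and checks only the balanced case, where every unit appears in every period.

```python
# Made-up records: (household_id, year, income in thousands)
records = [
    ("H1", 2019, 50), ("H2", 2019, 64),
    ("H1", 2020, 52), ("H2", 2020, 61),
]

# Cross-sectional data: many units at one point in time
cross_section_2019 = [r for r in records if r[1] == 2019]

# Time series data: one unit followed over time, in time order
time_series_h1 = sorted((r for r in records if r[0] == "H1"), key=lambda r: r[1])

def is_balanced_panel(rows):
    """True if the same set of units is observed in every period."""
    units_by_year = {}
    for unit, year, _ in rows:
        units_by_year.setdefault(year, set()).add(unit)
    unit_sets = list(units_by_year.values())
    return len(unit_sets) > 1 and all(s == unit_sets[0] for s in unit_sets)

# Pooled cross sections: a fresh draw of units in each period
pooled = [("H1", 2019, 50), ("H2", 2019, 64), ("H3", 2020, 58)]

print(is_balanced_panel(records))  # same households in both years
print(is_balanced_panel(pooled))   # household H3 appears only in 2020
```

Storing one row per unit and period (the "long" format) makes it easy to slice the same records into a cross section or a time series.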
Preparation: Install R on your computer (or other statistical software that you prefer)
The Hitchhiker’s Guide to Econometrics