
Lecture 1-Course Overview and Introduction of Econometrics

This document provides an overview of an introductory econometrics course. It introduces the lecturer, Dr. Tingting Wu, and their background and research interests in behavioral economics and experimental methodologies. It outlines the class structure, including participation, references, software used, and assessment breakdown. A tentative schedule of topics is also provided.


Lecture Notes 1

Course Overview and Introduction to Econometrics

Tingting Wu

National University of Singapore


Lecturer
• Dr. Tingting Wu
• Department of Strategy and Policy, NUS Business School
• Office: BIZ2, 03-35
• Email: [email protected]

• About me

• Ph.D in Economics
Universitat Autònoma de Barcelona and Barcelona School of Economics

Research Fields:
Behavioural and Experimental Economics, Applied Microeconomics, Public Economics
As an Economist, I’m interested in
• Behavioural Economics
=> Economics + Psychology + Econometrics
• Relaxes rationality assumptions used in traditional models

• Accepts bounded rationality due to limited information, limited cognitive capacity, emotions,
and cultural and social factors.

• Studies how agents deviate from the equilibrium and identifies the causes of such deviations
using observational data

• Experimental Methodologies
• Test hypotheses by observing human subjects’ decision-making process in controlled experiments.
Class
◆Class participation:
100% face-to-face class with no hybrid/virtual arrangements
No recordings for later reviewing

◆Core references:
Introductory Econometrics: A Modern Approach, 7th edition, South-Western. Wooldridge J.M. (2019)
(Older editions are also fine; the 5th ed. is on Canvas)
Using R for Introductory Econometrics, 2nd edition, Florian Heiss (2016) (E-book link)

◆Read it for fun:


Mostly harmless econometrics: An empiricist's companion. Princeton University Press. Angrist, J. D., &
Pischke, J. S. (2008).

◆Software:
All empirical exercises are carried out in R, using R and RStudio
Assessment
1. Quiz: 30%
2. Group Project: 30%
3. Final Test: 40%

• Quizzes check the progress of your learning.
➢ Around 30 mins with MCQs and several open questions in each quiz
• Group Project: analyzing a dataset using R
➢ Each of you will need to form your own team with other classmates*;
➢ Each team needs to submit an analysis report based on the group work;
➢ Each team also needs to prepare a presentation related to its report**.
• Final Test covers all the lecture materials throughout the course.
• Note: No make-up test and no late submission.
* Group size depends on how many students participate in the class. In total, we can have at most 10 groups.
**Each member of the team is required to present.
Tentative Schedule (Details are subject to change)
What is Econo+metrics

Ragnar Frisch (1895-1973)


• Norwegian Economist, Co-winner of the first Nobel Memorial Prize in Economic Sciences
(1969).
• The first editor of the journal Econometrica

“…studies that aim at a unification of the theoretical-quantitative and the empirical-quantitative approach to economic problems…”

“Experience has shown that each of these three view-points, that of statistics, economic theory,
and mathematics, is a necessary, but not by itself a sufficient, condition for a real
understanding of the quantitative relations in modern economic life. It is the unification
of all three that is powerful. And it is this unification that constitutes econometrics.”
Why Study Econometrics
Can you tell why these comments are wrong?

“A low relation in SAT score and GPA in college is observed. So SAT score is not a good measure to predict the achievement in college.” - the office of XX academic affairs

“A .3 hitter made three poor shows of batting, so he is gonna make a hit.” - Baseball game commentary

“Since we have two sons, this time it will be a daughter. Let’s try one more please…” - A father with two sons
Why Study Econometrics?

• Econometrics
➢ helps you quantify “how much” and identify the causal relationships suggested by
economic theory, using data (empirical evidence).
➢ helps you develop “Intuition” and “Sensitivity” about how things work through data.

Quantify: ‘By how much did one particular advertisement raise revenue?’
Identify: ‘Which vaccine is more effective? How significant is this result?’
Identify and Quantify: ‘Is the gender difference caused by social norms? How big is the gender difference?’
Predict: ‘What will the S&P 500 index be 5 days from now?’

Why Study Econometrics?
After years of trading, John, a derivatives trader at Goldman Sachs, started to believe that traders who
tend to perform better under volatility have something in common: cockiness. He wants to validate
his idea.

If you are John,


• How would you formalize your idea?
  • What is cockiness and how would you measure one’s cockiness?
  • How would you define volatility and how would you measure it?
• How would you collect the data?
  • Conduct an experiment?
  • Collect data on performance from an observational study?
• How would you analyse the data to validate your idea?
  • If it was an experimental study?
  • If it was an observational study?
Econometrics!
How does John analyze?
John’s idea that he wanted to test: Cocky traders earn more.

Hypothesis: a relationship between X and Y

              X                                        Y
Known as      Factor, Regressor, Explanatory           Target Variable, Dependent Variable
              Variable, Independent Variable
Defined as    Higher hormone level                     Return from trading
Measured by   Morning hormone level                    Daily P&L


How does John analyze?
John analyzed the data with the interquartile range and a box plot.

Interquartile
The 25th, 50th and 75th percentiles are the 1st, 2nd and 3rd quartiles. In particular, the 50th percentile is both the median and the 2nd quartile.

Interquartile Range = 3rd quartile - 1st quartile

[Figure: box plot, labelled from top to bottom: Outliers, Max, Q3, Median (Q2), Q1, Min]

• Summarize the data with the min, the 1st, 2nd and 3rd quartiles, and the max

• Sometimes the 10th and 90th percentiles are used instead of the min and max
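
The summary described on this slide can be sketched in a few lines of Python. This is only an illustration: the P&L numbers are made up, the quartiles use the "inclusive" interpolation rule of Python's `statistics` module (other software may use slightly different rules), and the 1.5 × IQR fence for outliers is a common box-plot convention rather than something stated in the lecture.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Hypothetical daily P&L figures (illustrative only, not real data).
daily_pnl = [-3.0, -1.0, 0.0, 1.0, 2.0, 2.0, 3.0, 4.0, 12.0]

lo, q1, med, q3, hi = five_number_summary(daily_pnl)
iqr = q3 - q1  # Interquartile Range = 3rd quartile - 1st quartile

# Assumed convention: points more than 1.5 * IQR outside the box count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in daily_pnl if v < lower_fence or v > upper_fence]

print("min, Q1, median, Q3, max:", lo, q1, med, q3, hi)
print("IQR:", iqr)
print("outliers:", outliers)
```

In this made-up sample the single large gain of 12.0 lies above the upper fence, which is the kind of point a box plot would draw separately as an outlier.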
How does John analyze?

John analyzed the data with the interquartile range and a box plot

What do you see?


Is there a relationship between cockiness and high P&L? If so, how can it be quantified?
Is there a causal relationship? - What determines what?
By how much does the P&L change if a trader’s cockiness level increases from low
to high?
Are there other relevant factors determining a high P&L, e.g., education?
Is it possible to forecast all other cocky traders’ profits?

--> Econometrics is needed.


Econometrics cares about Qualitative Variables too!!
• Quantitative Variable : measured by a number, usually with a unit
e.g.) age, family size, family income

• Qualitative Variable : not a number by itself, but becomes a random variable after coding it
with a number
e.g.) marital status (single, married, widowed, divorced, separated)
e.g.) employment status (full time, part time, temporary)
e.g.) dwelling (HDB 1b, HDB 2b, HDB 3b and more, Condo, Landed Property)
=> most demographic/ socio-economic variables or other categorical variables.
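
One common way to "code" a qualitative variable with numbers is to create 0/1 indicator (dummy) columns, one per category, leaving out a baseline category. The sketch below illustrates this for employment status; the helper `dummy_code` and the four respondents are hypothetical and not part of the course material.

```python
def dummy_code(values, categories, baseline):
    """Map each categorical value to 0/1 indicator columns, omitting the baseline."""
    kept = [c for c in categories if c != baseline]
    return [{c: int(v == c) for c in kept} for v in values]

employment_levels = ["full time", "part time", "temporary"]
sample = ["part time", "full time", "temporary", "full time"]  # hypothetical respondents

coded = dummy_code(sample, employment_levels, baseline="full time")
for status, row in zip(sample, coded):
    print(status, row)
```

A respondent in the baseline category ("full time" here) gets zeros in every column, so each indicator is read relative to that baseline.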

What can Econometrics do?

• Econometrics

➢ offers solutions for dealing with unobserved factors in economic models,
➢ provides “both a numerical answer to the question and a measure of how precise the answer is”
(Stock and Watson, 2007, p.7),
➢ as will be seen later, provides tools that allow us to refute economic hypotheses using statistical
techniques by confronting theory with data, and to quantify the probability that such decisions are
wrong,
➢ as will also be seen later, allows us to quantify the risks of forecasts, decisions and even of its own
analysis.

• Econometrics fills the gap between being a “student of economics” and being a “practicing economist”.

Steps of Econometric Analysis

1. Careful formulation of question/problem/task of interest.


2. Specification of an economic model.
3. Careful selection of a class of econometric models.
4. Selection of the right dataset and collection of data.
5. Selection of variables and estimation of an econometric model.
6. Diagnostics of correct model specification.
7. Use of the model and interpretation of the results.

Note that there exists a large variety of econometric models, and model choice depends very
much on the research question, the underlying economic theory, the availability of data, and the
structure of the problem.

Reading
Two paragraphs from “The Myth of Excess Enrollments in College-Becker”

The print media and the blogosphere have many discussions of the high and rising cost of attending
college, the debt burden weighing on college students, the fact that the average real earnings of college
graduates have risen only slowly during the past 40 years, and the difficulty young college students are
having in getting good jobs. The conclusion frequently reached is that too many high school graduates
seek a college degree, and that these graduates have been sold a bill of goods about the value of a
college education.
The facts cited are generally correct, but the conclusion about the low value of going to college is
completely wrong. The fallacy stems from believing that earnings from a college education determine
the benefit of a college education. The truth is that the benefit is determined by the earnings from a
college education relative to what earnings would be if a person stopped her education after high
school. A first approximation to this gain in earnings for the typical person is given by the difference
between the average earnings of college and high school graduates.

Resources from “The Becker-Posner Blog”: https://www.becker-posner-blog.com/


Reading

Questions:
• Find arguments/statements that could be verified or supported with an econometric approach.
• Identify the independent variables and the dependent variable in such statements.
• Pick one statement. What dataset is needed to test it?
• Are any qualitative variables needed?
• Could other factors, e.g., emotions or social norms, help explain the statements?
Economic model
An economic model
➢ has to reduce the complexity of reality such that it is useful for answering the question of
interest;
➢ is a collection of cleverly chosen assumptions from which implications can be inferred
(logically);
➢ should be as simple as possible and as complex as necessary;
➢ cannot be refuted or “validated” without empirical data of some kind.
In John’s example, we assume that the money earned by agent i is a function of how
cocky this agent is. Then we have:

$Y_i = f(X_i)$

where
$Y_i$ = the money that agent i earns
$X_i$ = the hormone level of agent i
Econometric model
Once we have specified an economic model, we need to turn it into an econometric model. For John:

1. Specify function f depending on the economic theory, i.e., linear, nonlinear.


2. Specify a particular econometric model.

Let’s take a simple linear model as an example:

$Y_i = \beta_0 + \beta_1 X_i + u_i$

$\beta_0, \beta_1$: parameters of the econometric model.

$u_i$: contains other possible factors that are not considered in this model
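
To make the simple linear model concrete, here is a minimal sketch of how its two parameters could be estimated from data. It assumes ordinary least squares as the estimator (the slide itself does not name one), uses simulated data, and the "true" intercept and slope are arbitrary values chosen for illustration. The function name `ols_simple` is hypothetical.

```python
import random

def ols_simple(x, y):
    """Closed-form OLS estimates (intercept, slope) for y = b0 + b1 * x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # slope: sample covariance over sample variance of x
    b0 = y_bar - b1 * x_bar   # intercept: the fitted line passes through the means
    return b0, b1

# Simulated data; the parameters below are assumptions chosen for illustration.
random.seed(0)
true_b0, true_b1 = 1.0, 2.0
x = [random.uniform(0, 10) for _ in range(500)]
y = [true_b0 + true_b1 * xi + random.gauss(0, 1) for xi in x]  # gauss term plays the role of u

b0_hat, b1_hat = ols_simple(x, y)
print(f"intercept estimate: {b0_hat:.3f}, slope estimate: {b1_hat:.3f}")
```

With 500 simulated observations the estimates should land close to, but not exactly on, the assumed values, because the unobserved factors in the error term add noise.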
Sample data
Population (Parameter) → Sampling (Stochastic Process) → Sample (Statistics)
Sample (Statistics) → Inference → Population (Parameter)

Inference
A process to find the characteristics of the population
using information from the sample
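
The population/sample distinction can be illustrated with a small simulation. Everything here is an assumption made for the sketch: the "population" is 100,000 simulated incomes, and in real work the population parameter would not be observable.

```python
import random
import statistics

random.seed(1)

# Hypothetical population of 100,000 incomes (simulated, not real data).
population = [random.lognormvariate(1.0, 0.5) for _ in range(100_000)]
mu = statistics.fmean(population)          # population parameter (usually unknown)

sample = random.sample(population, 400)    # simple random sample without replacement
x_bar = statistics.fmean(sample)           # sample statistic
se = statistics.stdev(sample) / len(sample) ** 0.5   # estimated standard error of the mean

print(f"population mean: {mu:.3f}")
print(f"sample mean:     {x_bar:.3f} (standard error about {se:.3f})")
```

Inference runs in the opposite direction of sampling: only `sample` is observed, and `x_bar` together with `se` is used to say something about `mu`.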
Representativeness of the Sample Data
◆Importance of the stochastic process of the Sampling

Question.
After hearing that your uncle earned a big return from his stock investment, you asked your father
whether he had invested anything in the stock market. He said, “Unfortunately, no.”
Disappointed, you wanted to estimate how much he could have earned by now if he had invested
just some of his money in the stock market 20 years ago.

How will you estimate the potential rate of return from the past 20 years of investment?
• What data to collect?
• How to analyze?
Representativeness of the Sample Data
To estimate the potential rate of return from the past 20 years of investment, you did the following
step by step:
1. You randomly selected 50 stocks.
2. You looked up their prices from 20 years ago.
3. You then calculated each stock’s rate of return by comparing that price with its current price.
4. Finally, you calculated the weighted average rate of return of these 50 randomly chosen stocks.

Will this estimate accurately represent the actual rate of return from the past 20 years of investment?
Or, will this over/underestimate the actual rate of return?
What do you think and why?
Representativeness of the Sample Data
Was there any issue in this sampling process? Explain.

Common types of sampling bias:


Self-selection: People with specific characteristics are more likely to agree to take your survey.
Non-response: People who refuse to participate in your study may differ from those who are in your
survey.
Under-coverage: Some members of a population are inadequately represented in your sample.
Survivorship: Successful observations, people and objects are more likely to be represented in your sample
than the unsuccessful ones.
Pre-screening: The way participants are pre-screened (e.g., the effect of one advertisement on different people) may
bias a sample.
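
As one illustration of the list above, survivorship bias can be simulated for the stock example: if stocks are drawn only from those still listed today, the ones that were delisted along the way never enter the sample. The sketch below is purely hypothetical; the yearly growth distribution and the delisting threshold are assumptions, not estimates from real markets.

```python
import random

random.seed(42)

def simulate_stock():
    """Return (final value of 1 unit invested, survived?) for one hypothetical stock."""
    value = 1.0
    for _ in range(20):                              # 20 years
        value *= random.lognormvariate(0.03, 0.35)   # assumed yearly growth factor
        if value < 0.2:                              # assumed delisting threshold
            return value, False
    return value, True

stocks = [simulate_stock() for _ in range(10_000)]
all_returns = [v - 1.0 for v, _ in stocks]
survivor_returns = [v - 1.0 for v, alive in stocks if alive]

mean_all = sum(all_returns) / len(all_returns)
mean_survivors = sum(survivor_returns) / len(survivor_returns)

print(f"share surviving:          {len(survivor_returns) / len(stocks):.1%}")
print(f"mean return, all stocks:  {mean_all:.2f}")
print(f"mean return, survivors:   {mean_survivors:.2f}")
```

Because every delisted stock in this setup has lost more than 80% of its value, averaging over survivors only must give a higher figure than averaging over all stocks that existed 20 years ago.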

What should you do?


Common Types of Research data
Observational data
• Captured through observation of a behaviour or activity in real time in the real world.
• Collected using methods such as human observation, open-ended surveys, or recorded information.
• Acquired in real time and very difficult or impossible to re-create if lost (not easily replicable)
• Causality must be inferred carefully, since we cannot manipulate one variable to see its direct effect on the
other (e.g., endogeneity, reverse causality)

Experimental data
• Captured through observation of human behavior in a designed experiment.
• Collected through active intervention by the researcher to produce and measure change or to create difference
when a variable is altered.
• Often reproducible or replicable, but may be expensive to do so.
• Allows researchers to determine a causal relationship, and it is typically projectable to a larger population.
Common Types of Research data

Simulation data
• Generated by imitating the operation of a real-world process or system over time using computer test models.
• Used to try to determine what would, or could, happen under certain conditions.
• The test model is often as important as, or even more important than, the data generated from the simulation.
• For example, predictions of weather conditions, economic models, chemical reactions. (Estimated data)

Derived/compiled data
• Involves using existing data points, often from different data sources, to create new data through some sort of
transformation, such as an arithmetic formula or aggregation.
• This type of data is usually reproducible or replicable once the raw data exists.
• For example, combining area and population data from Singapore to create population density data.
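
A derived variable of this kind can be sketched as follows. The region names and numbers are placeholders invented for the example (they are not official statistics); the point is only that two raw sources are combined through an arithmetic formula.

```python
# Hypothetical raw inputs from two separate sources (placeholder numbers, not official statistics).
area_km2 = {"Region A": 120.0, "Region B": 300.0}
population = {"Region A": 960_000, "Region B": 600_000}

# Derived variable: population density = population / area,
# computed only for regions that appear in both sources.
density = {
    region: population[region] / area_km2[region]
    for region in area_km2.keys() & population.keys()
}

for region in sorted(density):
    print(f"{region}: {density[region]:,.0f} people per km^2")
```

As long as the raw inputs are kept, the derived series can be regenerated at any time, which is why this type of data is usually reproducible.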
Various Types of Economic Data Sets
Axes: Individuals (person, household, firm, or other economic agent); Time

Time Series Data
• Indexed by time.
• e.g., price, interest rate, exchange rate
• Useful in finance

Cross Sectional Data
• One observation per individual
• e.g., surveys
• Focus of our module

Pooled Cross Sections Data
• Cross-sectional data allowing for different individuals from different times.

Panel Data
• Cross sectional + Time series
• A set of individuals surveyed repeatedly over time
Types of Data: Cross-sectional data
◆ Cross-sectional data refer to observations of many different individuals (subjects, objects) at a given
time, each observation belonging to a different individual.
◆ Examples:
➢ A. Gross annual income level of 10 randomly chosen households in Singapore in the year 2022.
➢ B. Cross-sectional data you see in daily life: oil prices across countries.

(A)
Household    Gross annual income level
H1           5
H2           2
H3           1
H4           2
…            …
H10          3

(B) [Figure] Source: https://knoema.com/infographics/vyronoe/cost-of-oil-production-by-country
Types of Data: Time series data
◆ Time series data is a sequence of data points indexed in time order.
◆ Time series data is used for studying how past events influence future events and lags in behavior.
◆ Examples:
➢ A. Gross annual income level of 1 randomly chosen household in Singapore from the year 2001 to 2022.
➢ B. Time series data you may see in a newspaper: crude oil prices over time.

(A)
Year    Gross annual income level
2001    2
2002    2
2003    3
2004    2
…       …
2022    4

(B) [Figure]
Types of Data: Panel data
◆ Panel data is a dataset in which the behavior of entities (e.g., individuals, countries, companies) is
observed across time. Panel data is also known as longitudinal or cross-sectional time-series data.
◆ Examples:
➢ A. Gross annual income level of 2 randomly chosen households in Singapore from the year 2001 to 2022.
➢ B. Panel dataset that you may see in a newspaper: oil production by country over time

(A)
Year    Household    Gross annual income level
2001    H1           5
2001    H2           2
2002    H1           1
2002    H2           2
…       …            …
2022    H1           3
2022    H2           4

(B) [Figure]
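
The relationship between panel, cross-sectional and time series data can be illustrated with the rows of table (A) on this slide. The sketch below stores only the rows actually listed (the years elided by "…" are left out), and the helper names `cross_section` and `time_series` are hypothetical.

```python
# Rows copied from table (A): (year, household, gross annual income level).
panel = [
    (2001, "H1", 5), (2001, "H2", 2),
    (2002, "H1", 1), (2002, "H2", 2),
    (2022, "H1", 3), (2022, "H2", 4),
]

def cross_section(rows, year):
    """All households observed in one year."""
    return {hh: income for y, hh, income in rows if y == year}

def time_series(rows, household):
    """One household followed over the years."""
    return {y: income for y, hh, income in rows if hh == household}

print(cross_section(panel, 2001))   # one slice across households
print(time_series(panel, "H1"))     # one slice across time
```

Fixing the year gives a cross section, and fixing the household gives a time series, which is why panel data is described as cross sectional + time series.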
Types of Data: Pooled Cross Section Data
◆ Pooled cross-section data is a collection of cross-sectional data that allows the units to change
across time.
◆ A pooled cross section is usually analyzed much like standard cross-sectional data, except that we need
to account for secular differences in the variables (not specific individuals) across time.
◆ Example: Gross annual income level of 9 randomly chosen households in Singapore from the year 2019
and one randomly chosen new household in Singapore from the year 2020.

Year    Household    Gross annual income level
2019    H1           5
2019    H2           2
2019    H3           1
2019    H4           2
…       …            …
2020    H10          3

Advanced Issues in Econometrics.
Summary
In this course, we focus on analyzing cross-sectional datasets from both observational and
experimental studies, mainly through simple and multiple regression models and their
applications in different business settings.

We will also recall the arithmetic quality of data:

• quantitative variables.

• qualitative or categorical variables.

Readings: Sections 1.1-1.3, Wooldridge (5th ed. on Canvas)

Preparation: Install R on your computer (or other statistical software that you prefer)
The Hitchhiker’s Guide to Econometrics

“All models are wrong, some are useful.”

“Everything should be made as simple as possible, but not simpler.”


-Albert Einstein

“’Obvious' is the most dangerous word in mathematics.”


-E.T. Bell
That is it for today!
