Introduction and Subject Matter of Econometrics
Institute of Lifelong Learning, University of Delhi
Table of Contents
1: Learning Outcomes
2: Introduction
2.1: Definition
2.2: Scope
3: Methodology or steps in an Econometric study
3.1: Creation of hypothesis or statement of theory
3.2: Data collection
3.3: Specification of the mathematical economic model
3.4: Specification of the statistical/econometric model
3.5: Estimation of the parameters of the chosen econometric
model
3.6: Testing of model specification
3.7: Testing and analyzing the hypothesis derived from the
model
1. Learning Outcomes
2. Introduction
Definition:
According to Arthur S. Goldberger, Econometrics is the social science which applies the
tools of economic theory, mathematics and statistical inference to analyze the economic
phenomena.
Econometrics draws on mathematics and statistics (mainly regression and trend-analysis
techniques).
Scope:
From the above definitions we can say that econometrics combines mathematical statistics,
economic statistics, economic theory and mathematical economics. The main aim of
econometrics is to test a hypothesized relationship between an independent (predictor)
variable and a dependent (predicted) variable. The following are reasons to study econometrics:
Mathematical statistics provides many of the tools used in econometrics, but economic data
cannot be controlled directly. Most economic data are non-experimental and therefore require
special methods to be developed by the econometrician. Examples are the data on income,
consumption, prices, savings, investments etc. collected by public and private agencies. The
econometrician takes such data as given. Mathematical statistics normally fails to deal with
the special problems that arise, as these data generally contain measurement errors like
omission of relevant
Economic statistics is concerned with collecting, processing and presenting economic data in
the form of tables, charts and diagrams. An economic statistician collects data on employment,
unemployment, GDP, prices etc., which are then used for econometric work.
Econometrics is beneficial for students majoring in business and economics, as in their jobs
they may be asked to estimate demand and supply functions, to forecast money supply, sales or
interest rates, or to estimate the price elasticity of products. Thus, econometrics is
an integral part of their training.
Types of Econometrics:
1. Theoretical
2. Applied
1. Theoretical:
Simple equation: Y = a + bx
Simultaneous equation: Y = a + b1x1 + b2x2
2. Applied:
Applied econometrics is used to obtain practical, numerical results in economic research. It
applies the methods of theoretical econometrics to particular fields of economic study, such
as demand and supply, the consumption function etc. It is due to applied econometrics
that researchers have obtained numerical results in these important fields of economic study.
commodities, relative income, taste and preferences of consumers etc. We assume these
factors to be constant (ceteris paribus). But how do we find out this relationship? This
raises an empirical question.
Quantitative information is required on the two variables for empirical purposes. For this
purpose, three types of data are generally available:
1. Time Series data: Such data are collected over a period of time, with a
separate observation for each time period. For example, data on money
supply, employment and unemployment, GDP, government deficit etc. are
collected at regular intervals: share prices are recorded daily,
money supply is measured weekly, the unemployment rate is calculated monthly,
GDP is calculated quarterly and the government budget is announced annually.
The data collected may be qualitative (like gender, race, marital status,
employment status etc.) or quantitative (like money supply, income, prices
etc.) in nature. Trends and seasonality are important in this type of data.
2. Cross-sectional data: Such data on one or more variables are collected
at one point in time. For example, the Government of India
conducts a census of population every 10 years.
3. Pooled data: It has the characteristics of both time series and cross-
sectional data. For instance, data collected on unemployment rate for 20
economies for a period of 10 years, will constitute an example of pooled data
– time series data is the data on the unemployment rate for each economy
for the 10-year period, while cross-sectional data is the data on the
unemployment rate for the 20 countries. Thus, there are 200 observations in
this pooled data.
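The count of 200 observations can be checked with a small sketch. The economy labels, the 2001 to 2010 window and the placeholder values are assumptions made only for illustration; no real unemployment figures are used.

```python
# Hypothetical labels: 20 economies observed over an assumed 10-year window.
economies = [f"economy_{i:02d}" for i in range(1, 21)]
years = list(range(2001, 2011))

# Pooled data: one unemployment-rate observation per (economy, year) pair.
# The values are placeholders (None); only the structure matters here.
pooled = {(economy, year): None for economy in economies for year in years}

# Time series: a single economy followed across all years.
time_series = [key for key in pooled if key[0] == "economy_01"]

# Cross-section: all economies observed in a single year.
cross_section = [key for key in pooled if key[1] == 2005]

print(len(pooled), len(time_series), len(cross_section))
```

Each economy contributes a time series of 10 observations, and each year contributes a cross-section of 20 observations.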
Panel Data: it is a special kind of pooled data and is also known as micro-
panel or longitudinal data. In this type of data, same cross-sectional unit,
such as household or firm, is studied over a period of time. For instance, the
q = a + bp
To see how quantity demanded (q) behaves in relation to price (p) of the commodity, we
take an example of demand for sports clubs and its price and then draw a scatter diagram:
The scatter diagram above shows that price and quantity demanded are inversely related, i.e.
there is a downward trend. As an approximation, a straight line has been drawn through the
scatter points, and the relationship between price and quantity demanded is written
using a simple mathematical model as follows:
q=α+βp
The above equation states that quantity (q) has a linear relationship with price (p); α and β
are called the parameters of the linear function. α is also called the intercept and it gives
the value of q when p is 0, whereas β is called the slope and it shows the rate of change in
q per unit change in p. The slope coefficient can be positive or negative, depending on the
relationship between the two variables. In the above example, it is negative.
We know that quantity demanded also depends upon many other factors, like the prices of
other commodities, relative income, changing tastes and preferences etc. However, not all of
these factors can be captured in the model above, and many of them cannot even be
measured. From the scatter diagram drawn above we can say that the relationship between
price and quantity is not exactly linear, as the data points in the diagram do not all lie
exactly on the straight line. Thus, the value of q computed from the price alone will
generally be greater or less than the real-world value and will equal it only by
chance. Hence, we can re-write the above equation as:
q=α+βp+u
This is the linear regression model (econometric model), where ‘u’ is a random variable
known as the disturbance or the error term. ‘u’ represents the effect of the factors that
disturb the exact mathematical relationship, that is, all those forces which affect ‘q’
but are not explicitly included in the model. Regression analysis is a technique that
models and investigates the relationship between variables. It is the study of the
dependence of one dependent variable on one or more explanatory variables.
The simple linear regression model is used to describe the linear dependence of one variable
on another. It predicts values of one variable from given values of the other variable.
Starting from a scatter plot, it determines the best-fit line, i.e. the line that minimizes
the sum of squared residuals. The simple linear regression model is represented by:
Y = α + βX + u
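The idea of a best-fit line that minimizes the sum of squared residuals can be written out explicitly. The following is a standard textbook sketch, assuming a model of the form Y = α + βX + u with n observations indexed by i; the hats, bars and the index i are notation introduced here for illustration.

```latex
% Least-squares criterion: choose the estimates that minimize the sum of squared residuals
\min_{\hat{\alpha},\,\hat{\beta}} \; \sum_{i=1}^{n} \hat{u}_i^{2}
  = \sum_{i=1}^{n} \left( Y_i - \hat{\alpha} - \hat{\beta} X_i \right)^{2}

% Setting the two partial derivatives to zero gives the estimators
\hat{\beta} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^{2}},
\qquad
\hat{\alpha} = \bar{Y} - \hat{\beta}\,\bar{X}
```

Here the bars denote sample means, so the fitted line always passes through the point of means.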
Linear regression asserts that, among the many lines that could be drawn, there is just one
which best explains the trend and the relationship between the two variables (with minimum
error). The linear regression model is the prime subject of econometrics. The variable on the
L.H.S., i.e. ‘q’, is called the dependent variable, while that on the R.H.S., i.e. ‘p’, is
called the independent or explanatory variable, as it explains the variation in ‘q’. In
regression analysis, we study the effect of one variable (independent) on the other variable
(dependent). However, we should keep in mind that a regression relationship does not by
itself establish causation. We can therefore call the relationship a predictive one, but how
can we predict ‘q’ for a given ‘p’?
For computation and estimation, we use the method of Ordinary Least Squares (OLS).
OLS chooses the estimates of α and β such that the sum of squared differences between the
actual and the predicted values of q is at its minimum. Taking the data from the
above demand and price example and using the gretl software, we get the following results:
Model 1: OLS
q̂ = 211.315 – 0.354631 p
By applying OLS, we find that the above equation is the best-fitting regression line: the
OLS estimators are linear and unbiased, and the fitted line minimizes the sum of squared
distances between the line and the data points.
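A minimal sketch of how an OLS line is computed for a two-variable model of the form q = α + βp + u is given below. The price and quantity figures are made up for illustration and are not the sports-club data behind the gretl results; the function name `ols_simple` is likewise hypothetical.

```python
def ols_simple(p, q):
    """Return (alpha_hat, beta_hat) for the model q = alpha + beta * p + u."""
    n = len(p)
    p_bar = sum(p) / n
    q_bar = sum(q) / n
    # Sum of cross-products and sum of squares, both in deviations from the means.
    s_pq = sum((pi - p_bar) * (qi - q_bar) for pi, qi in zip(p, q))
    s_pp = sum((pi - p_bar) ** 2 for pi in p)
    beta_hat = s_pq / s_pp
    alpha_hat = q_bar - beta_hat * p_bar
    return alpha_hat, beta_hat


# Made-up observations (assumption): price and quantity demanded.
prices = [100, 200, 300, 400, 500]
quantities = [180, 138, 110, 65, 40]

alpha_hat, beta_hat = ols_simple(prices, quantities)
residuals = [qi - (alpha_hat + beta_hat * pi) for pi, qi in zip(prices, quantities)]

print(round(alpha_hat, 3), round(beta_hat, 3))
```

For these made-up numbers the slope comes out negative (about –0.353), and the residuals sum to zero, as they always do when an intercept is included.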
Where α̂ and β̂ are the estimates of α and β respectively. The estimated value of α is 211.315
and that of β is –0.354631. The symbol hat, i.e. ‘^’, is used to denote an estimate. The
interpretation of this equation is: if the price increases by one unit, ceteris paribus, the
quantity demanded is expected to decrease on average by about 0.35 units. We use the
words ‘on average’ since the relationship between the dependent and the independent
variable is somewhat imprecise due to the presence of the error term u. Thus, the
estimated regression line gives the relationship between the average value of the dependent
variable and the independent variable, i.e. how ‘q’ reacts on average to a unit change in
‘p’. The estimated intercept, 211.315, gives the average value of quantity demanded if the
price were zero, i.e. 211.315 units.
R-squared = 0.904443 shows that the model is a good fit. If R-squared is greater than 0.75, we
normally take the model to be a good fit. We will study this in detail in later chapters.
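R-squared can be computed as one minus the ratio of the residual sum of squares (RSS) to the total sum of squares (TSS). The sketch below illustrates the calculation; the data and the fitted coefficients are hypothetical and are not the sports-club data behind the value reported above.

```python
def r_squared(actual, fitted):
    """R-squared = 1 - RSS / TSS for a regression that includes an intercept."""
    mean_actual = sum(actual) / len(actual)
    rss = sum((a - f) ** 2 for a, f in zip(actual, fitted))  # unexplained variation
    tss = sum((a - mean_actual) ** 2 for a in actual)        # total variation
    return 1 - rss / tss


# Made-up data (assumption) and the OLS line fitted to it.
prices = [100, 200, 300, 400, 500]
quantities = [180, 138, 110, 65, 40]
alpha_hat, beta_hat = 212.5, -0.353

fitted = [alpha_hat + beta_hat * p for p in prices]
r2 = r_squared(quantities, fitted)

print(round(r2, 4))
```

A value close to 1 means that the fitted line accounts for most of the variation in the dependent variable.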
Taking the demand and price relationship, we need to find out how adequate our model is.
We know that the demand for a particular commodity depends on the price of that commodity,
the income of the individual, the prices of related commodities etc. Thus the relationship
can be represented by:
q = f(p, p1, p2, …, pn, Y)
q = β0 + β1 p + β2 p1 + β3 Y + u
Where p is the price of the commodity, p1, …, pn are the prices of related commodities, Y is
income, and β0 summarizes the effect of all the factors which are not supposed to change
during the period under study. The above equation is called a multiple regression model, in
contrast to the initial simple linear regression or two-variable (bivariate) model. In the
two-variable regression model there was one explanatory variable, whereas now there are
several explanatory variables. Here we have again added the disturbance term ‘u’ because,
however many explanatory variables we add, we cannot fully explain the variation in
the dependent variable in the model. The number of explanatory variables to be included in
the model is the decision of the researcher, but the underlying economic theory should
indicate what these variables might be, and the model chosen should clearly reflect the
economic phenomenon. We can estimate the above multiple regression in the same way as we
did for the simple regression model, for a given data set.
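One way to estimate a multiple regression is to solve the normal equations (X'X)b = X'y. The sketch below does this in plain Python. Everything in it is assumed for illustration: the three regressors (own price, price of a related good, income), the coefficient values, and the error-free data, which are generated from those coefficients so that the estimates can be checked against them.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols_multiple(rows, y):
    """OLS via the normal equations; a column of ones is added for the intercept."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[i] * xi[j] for xi in x) for j in range(k)] for i in range(k)]
    xty = [sum(xi[i] * yi for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical coefficients: intercept, own price, related price, income.
true_coefs = [200.0, -35.0, 20.0, 5.0]
# Hypothetical observations of (own price, related price, income).
rows = [(1, 0.8, 10), (2, 0.9, 12), (3, 0.7, 11),
        (4, 1.2, 15), (5, 1.0, 13), (6, 1.1, 18)]
# Error-free quantities generated from the assumed coefficients (u = 0).
y = [true_coefs[0] + sum(c * v for c, v in zip(true_coefs[1:], r)) for r in rows]

estimates = ols_multiple(rows, y)
print([round(b, 4) for b in estimates])
```

Because the artificial data contain no disturbance term, the estimates reproduce the assumed coefficients; with real data they would differ from the unknown true values.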
3.7 Testing and analyzing the hypothesis derived from the model:
After choosing an appropriate model, we may want to test the hypotheses derived from
economic theory and/or some prior experience. For instance, theory says that there is a
negative relationship between the price of a commodity and the quantity demanded of that
commodity. Is this hypothesis borne out by our results? In our estimation in the example
above, the estimated coefficient of price was negative; thus our statistical results agree
with the hypothesis.
An estimated regression model can also be used for prediction or forecasting. Suppose we
want to predict the demand for sports clubs in 2015, and suppose the price of the sports club
was 500 units in 2015. If we substitute this value in the estimated regression line:
q̂ = 211.315 – 0.354631 (500)
We get 33.9995 units as the predicted value of quantity demanded for sports clubs in the
year 2015, i.e. if the price of the sports club in 2015 was 500 units, demand for sports clubs
would be approximately 34 units. When the real data on the price and quantity of sports clubs
become available for 2015, we can compare the predicted value with the real value. Any
difference between the two values gives the prediction error. Statisticians try to keep
this prediction error as small as possible.
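The forecast can be reproduced with a short sketch that uses the estimated coefficients reported in the text (211.315 and –0.354631). The "actual" quantity of 36 units is a made-up figure, included only to show how a prediction error would be computed once real data arrive.

```python
# Estimated coefficients taken from the gretl results quoted in the text.
ALPHA_HAT = 211.315
BETA_HAT = -0.354631


def predict_quantity(price):
    """Predicted quantity demanded from the estimated regression line."""
    return ALPHA_HAT + BETA_HAT * price


def prediction_error(actual, predicted):
    """Difference between the observed value and the predicted value."""
    return actual - predicted


predicted = predict_quantity(500)

# Hypothetical observed quantity for 2015 (assumption, not real data).
actual = 36
error = prediction_error(actual, predicted)

print(round(predicted, 4), round(error, 4))
```

With a price of 500 units the predicted demand is about 34 units, in line with the calculation in the text.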
Just as we examined the econometric methodology using the price and demand relationship, a
similar procedure can be used to estimate the quantitative relationships between variables in
any theory of economics, politics, sociology, psychology, meteorology, international relations etc.
4. Summary:
5. Exercise:
1. Define Econometrics.
2. What is the scope of Econometric study?
3. What are the steps involved in the methodology of Econometric study?
6. References:
7. Quiz:
a) α is called the intercept, β is called the slope, and ε is called the residual
b) α is called the slope, β is called the intercept, and ε is called the residual
c) α is called the intercept, β is called the slope, and ε is called the error
d) α is called the slope, β is called the intercept, and ε is called the error
11. Which of the following are alternative names for the dependent variable
(usually denoted by y) in linear regression analysis?
i) The regressand
ii) The regressor
iii) The explained variable
iv) The explanatory variable
a) The difference between the actual value, y, and the mean, y-bar
b) The difference between the fitted value, y-hat, and the mean, y-bar
c) The difference between the actual value, y, and the fitted value, y-hat
d) The square of the difference between the fitted value, y-hat, and the
mean, y-bar
Answers:
1. a
2. a
3. a
4. a
5. b
6. a
7. a
8. b
9. a
10. a
11. b
12. c
8. MCQ:
1. The regression model includes a random error or disturbance term for a variety of
2. Which of the following assumptions about the error term is not part of the so-called
"classical assumptions"?
to the t distribution
d. when the estimated coefficient has the opposite sign to that predicted
by theory
e. when you are testing a hypothesis other than that the parameter
equals zero
ANSWERS:
1. e
2. e
3. d
4. b
5. d
6. b
7. c
8. a