Econometrics Aio
Three types of data may be available for empirical analysis: time series, cross-section, and
pooled (i.e., combination of time series and cross-section) data.
Time Series Data
● A time series is a set of observations on the values that a variable takes at different
times. Such data may be collected at regular time intervals, such as daily (e.g., stock
prices, weather reports), weekly (e.g., money supply figures), monthly (e.g., the
unemployment rate, the Consumer Price Index [CPI]), quarterly (e.g., GDP), annually
(e.g., government budgets), quinquennially, that is, every 5 years (e.g., the census of
manufactures), or decennially, that is, every 10 years (e.g., the census of population).
● A time series is stationary if its mean and variance do not vary systematically over time.
● With the advent of high-speed computers, data can now be collected over extremely
short intervals of time; stock prices, for example, can be obtained literally
continuously (the so-called real-time quote).
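The stationarity condition above can be illustrated with a rough check: split a series in half and compare the mean and variance of each half. This is only an informal sketch, not a formal stationarity test; the two simulated series, the seed, and the helper name half_summaries are assumptions made for illustration.

```python
import random
import statistics

def half_summaries(series):
    """Return (mean, variance) for the first and the second half of a series."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (
        (statistics.mean(first), statistics.pvariance(first)),
        (statistics.mean(second), statistics.pvariance(second)),
    )

rng = random.Random(0)  # fixed seed so the run is repeatable
shocks = [rng.gauss(0, 1) for _ in range(2000)]

# White noise: mean and variance should look similar in both halves.
white_noise = shocks

# Upward trend plus noise: the mean drifts systematically over time.
trending = [0.01 * t + e for t, e in enumerate(shocks)]

print("white noise:", half_summaries(white_noise))
print("trending:   ", half_summaries(trending))
```

The white-noise halves should report similar means, while the trending series shows a clearly higher mean in its second half, which is the kind of systematic variation that rules out stationarity.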
Cross-Section Data
● Cross-section data are data on one or more variables collected at the same point
in time, such as the census of population conducted by the Census Bureau every
10 years (the latest being in year 2000).
● Cross-sectional data too have their own problems, specifically the problem of
heterogeneity. From the data given in Table 1.1 we see that we have some
states that produce huge amounts of eggs (e.g., Pennsylvania) and some that
produce very little (e.g., Alaska). When we include such heterogeneous units in a
statistical analysis, the size or scale effect must be taken into account.
Pooled Data
Pooled, or combined, data contain elements of both time series and cross-section data.
Panel Data
This is a special type of pooled data in which the same cross-sectional unit (say, a family or
a firm) is surveyed over time. For example, the U.S. Department of Commerce carries out
a census of housing at periodic intervals. At each periodic survey the same household
(or the people living at the same address) is interviewed to find out if there has been any
change in the housing and financial conditions of that household since the last survey. By
interviewing the same household periodically, the panel data provide very useful
information on the dynamics of household behavior.
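The difference between the data types described above can be sketched with a small set of records. The household identifiers, survey years, and rent figures below are entirely hypothetical and serve only to show how the same records can be viewed as a cross-section, a time series, or a panel.

```python
# Hypothetical records: (household_id, survey_year, monthly_rent).
records = [
    ("H1", 2000, 500), ("H2", 2000, 650), ("H3", 2000, 700),
    ("H1", 2005, 550), ("H2", 2005, 640), ("H3", 2005, 780),
    ("H1", 2010, 600), ("H2", 2010, 700), ("H3", 2010, 820),
]

# Cross-section: many units observed at one point in time.
cross_section_2005 = {h: rent for h, year, rent in records if year == 2005}

# Time series: one unit observed at several points in time.
time_series_h1 = {year: rent for h, year, rent in records if h == "H1"}

# Panel: the same units followed over time, so per-unit changes are observable.
panel = {}
for h, year, rent in records:
    panel.setdefault(h, {})[year] = rent

change_2000_2010 = {h: obs[2010] - obs[2000] for h, obs in panel.items()}
print(change_2000_2010)
```

Only the panel view supports the last step, because computing a change for each household requires that the same household appear in every survey.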
WHAT IS ECONOMETRICS?
Econometrics, the result of a certain outlook on the role of economics, consists of the
application of mathematical statistics to economic data to lend empirical support to the models
constructed by mathematical economics and to obtain numerical results.
Econometrics may be defined as the social science in which the tools of economic theory,
mathematics, and statistical inference are applied to the analysis of economic phenomena.
Methodology of Econometrics
1. Statement of theory or hypothesis.
2. Specification of the mathematical model of the theory.
3. Specification of the statistical, or econometric, model.
4. Obtaining the data.
5. Estimation of the parameters of the econometric model.
6. Hypothesis testing.
7. Forecasting or prediction.
8. Using the model for control or policy purposes.
To allow for the inexact relationships between economic variables, the econometrician
would modify the deterministic consumption function in Eq. (I.3.1) as follows:
Y = β1 + β2X + u (I.3.2)
where u, known as the disturbance, or error, term, is a random (stochastic) variable that
has well-defined probabilistic properties. The disturbance term u may well represent all
those factors that affect consumption but are not taken into account explicitly.
WHY IS THE ERROR TERM NEEDED (EXAMPLE)?
If we were to obtain data on consumption expenditure and disposable (i.e., after-tax) income of a
sample of, say, 500 American families and plot these data on a graph paper with consumption
expenditure on the vertical axis and disposable income on the horizontal axis, we would not
expect all 500 observations to lie exactly on the straight line of Eq. (I.3.1) because, in addition to
income, other variables affect consumption expenditure. For example, size of family, ages of the
members in the family, family religion, etc., are likely to exert some influence on consumption.
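The 500-family example can be mimicked with simulated data. This is a sketch under stated assumptions: the intercept, slope, income range, and spread of the disturbance are hypothetical values chosen for illustration, not estimates from any real survey.

```python
import random

rng = random.Random(42)  # fixed seed for repeatability

# Hypothetical parameters of the consumption function.
beta1, beta2 = 200.0, 0.7

families = []
for _ in range(500):
    income = rng.uniform(1000, 5000)          # X: disposable income
    u = rng.gauss(0, 150)                     # disturbance: family size, ages, etc.
    consumption = beta1 + beta2 * income + u  # Y = β1 + β2X + u
    families.append((income, consumption))

# How many observations fall exactly on the deterministic line?
on_line = sum(1 for x, y in families if y == beta1 + beta2 * x)

# Average vertical distance of the observations from that line.
mean_abs_dev = sum(abs(y - (beta1 + beta2 * x)) for x, y in families) / len(families)

print(on_line, round(mean_abs_dev, 1))
```

Because each family receives its own draw of u, the points scatter around the line rather than lying on it, which is the behavior the disturbance term is meant to capture.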
Obtaining Data
To estimate the econometric model given in Eq. (I.3.2), that is, to obtain the numerical
values of β1 and β2, we need data.
Estimation of the Econometric Model
The numerical estimates of the parameters give empirical content to the consumption function.
The statistical technique of regression analysis is the main tool used to obtain the estimates.
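As a minimal sketch of this estimation step, the two-variable model can be fitted with the standard least-squares formulas. The income and consumption figures below are hypothetical, and the function name ols_two_variable is an assumption made for this example.

```python
def ols_two_variable(x, y):
    """Return (b1, b2) for the fitted line y_hat = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares, both in deviations from the means.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx             # slope estimate
    b1 = y_bar - b2 * x_bar    # intercept estimate
    return b1, b2

# Hypothetical income (X) and consumption (Y) figures, for illustration only.
income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
consumption = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

b1, b2 = ols_two_variable(income, consumption)
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}")
```

The slope b2 is the estimated change in consumption per unit change in income, and b1 is the estimated intercept of the consumption function.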
Hypothesis Testing
● Assuming that the fitted model is a reasonably good approximation of reality, we have to
develop suitable criteria to find out whether the estimates obtained in, say, Equation I.3.3
are in accord with the expectations of the theory that is being tested.
● According to “positive” economists like Milton Friedman, a theory or hypothesis that is
not verifiable by appeal to empirical evidence may not be admissible as a part of
scientific enquiry.
● Confirmation or refutation of economic theories on the basis of sample evidence is
based on a branch of statistical theory known as statistical inference (hypothesis testing).
Forecasting or Prediction
If the chosen model does not refute the hypothesis or theory under consideration, we may
use it to predict the future value(s) of the dependent, or forecast, variable Y on the basis of
the known or expected future value(s) of the explanatory, or predictor, variable X.
Primary Data: Primary data are measurements observed and recorded as a part of an
original study. When the data required for a particular study can be found neither in the
internal records of the enterprise nor in published sources, it becomes necessary to
collect original data.
Methods of obtaining primary data:
1. Questioning: data are collected by asking questions of people who are thought to
have the desired information.
2. Observation: data are collected by observation. The investigator observes
objects or actions and records what is observed.
Secondary Data: Secondary data are data that have already been collected by others and
are reused by the investigator. They can be obtained from journals, reports, government
publications, etc. At the national level, data are collected through the census and by
bodies such as the National Sample Survey Organisation (NSSO) and the Reserve Bank of
India. The source of the data must be given when secondary data are used.
GOODNESS OF FIT
Goodness of fit refers to how “well” the sample regression line fits the data. The
coefficient of determination r² (two-variable case) or R² (multiple regression) is a
summary measure of that fit.
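A short sketch of how r² can be computed in the two-variable case, as one minus the ratio of the residual sum of squares to the total sum of squares. The data are hypothetical and the function name r_squared is an assumption for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for the fitted line y_hat = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    # Residual sum of squares: variation left unexplained by the line.
    rss = sum((yi - (b1 + b2 * xi)) ** 2 for xi, yi in zip(x, y))
    # Total sum of squares: variation of Y around its mean.
    tss = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - rss / tss

# Hypothetical income (X) and consumption (Y) figures, for illustration only.
income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
consumption = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

r2 = r_squared(income, consumption)
print(f"r2 = {r2:.4f}")
```

A value close to 1 means the line accounts for most of the variation in Y; a value close to 0 means it accounts for very little.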
CONFIDENCE INTERVAL
Confidence intervals are also called interval estimates because they provide a range of
likely values for the population parameter, and not just a point estimate.
The width of the confidence interval is proportional to the standard error of the estimator.
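The link between interval width and standard error can be sketched for the slope of a two-variable regression. The data are hypothetical; the critical value 2.306 is the tabulated two-sided 95% t value for 8 degrees of freedom and is passed in by hand because the Python standard library has no t distribution. The function name slope_confidence_interval is an assumption for this example.

```python
import math

def slope_confidence_interval(x, y, t_crit):
    """Return (lower, upper, se) for the slope of y_hat = b1 + b2 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    rss = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)            # two parameters were estimated
    se_b2 = math.sqrt(sigma2_hat / sxx)   # standard error of the slope
    half_width = t_crit * se_b2           # width scales with the standard error
    return b2 - half_width, b2 + half_width, se_b2

# Hypothetical income (X) and consumption (Y) figures, for illustration only.
income = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
consumption = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

# 2.306: two-sided 95% critical t value for n - 2 = 8 degrees of freedom.
lower, upper, se = slope_confidence_interval(income, consumption, t_crit=2.306)
print(f"se = {se:.4f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```

Doubling the standard error would double the width of the interval, since the half-width is simply the critical value times the standard error.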
The ordinary least squares (OLS) method estimates the unknown parameters (b1, b2, …,
bn) by minimizing the sum of squared residuals (RSS). The sum of squared residuals is also
termed the sum of squared errors (SSE).
This method is also known as the least-squares method for regression or linear regression.
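For the two-variable case, the minimization can be written out as follows. This is the standard textbook derivation, sketched here with b1 and b2 for the estimates; the subscript i indexing the n sample observations and the bars denoting sample means are notational assumptions.

```latex
\begin{align*}
\text{RSS}(b_1, b_2) &= \sum_{i=1}^{n} \left( Y_i - b_1 - b_2 X_i \right)^2 \\
\frac{\partial\,\text{RSS}}{\partial b_1} &= -2 \sum_{i=1}^{n} \left( Y_i - b_1 - b_2 X_i \right) = 0 \\
\frac{\partial\,\text{RSS}}{\partial b_2} &= -2 \sum_{i=1}^{n} X_i \left( Y_i - b_1 - b_2 X_i \right) = 0 \\
\Rightarrow \quad b_2 &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad b_1 = \bar{Y} - b_2 \bar{X}
\end{align*}
```

The two first-order conditions are the normal equations; solving them jointly gives the slope and intercept estimates.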
TESTS OF HYPOTHESIS