
ch1 Guj

Chapter 1 of the Econometrics I course discusses the nature of regression analysis, its historical origins, and modern interpretations. It explains the difference between regression and correlation, the importance of distinguishing between dependent and explanatory variables, and various types of data used in empirical analysis. The chapter also highlights potential issues with data quality in economic research.

Uploaded by

Baher Ahmednur


Economics 36 (Shahjalal University of Science and Technology)



Downloaded by Baher Ahmednur ([email protected])

ECONOMETRICS I
CHAPTER 1: THE NATURE OF REGRESSION ANALYSIS

Textbook: Damodar N. Gujarati (2004), Basic Econometrics,
4th edition, The McGraw-Hill Companies


HISTORICAL ORIGIN OF THE TERM REGRESSION
• The term regression was introduced by Francis
Galton.
• He found that, although there was a tendency for
tall parents to have tall children and for short
parents to have short children, the average height
of children born of parents of a given height
tended to move or “regress” toward the average
height in the population as a whole. This
tendency is called Galton’s law of universal
regression.


THE MODERN INTERPRETATION OF REGRESSION
• Regression analysis is concerned with the study of the
dependence of one variable, the dependent variable,
on one or more other variables, the explanatory
variables, with a view to estimating and/or predicting
the (population) mean or average value of the former
in terms of the known or fixed (in repeated sampling)
values of the latter.
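
One way to make this definition concrete is to hold the explanatory variable at a few fixed values and average the dependent variable at each of them. The sketch below does this with simulated data. The ages, the height-generating rule, and the function names `simulate_heights` and `conditional_means` are illustrative assumptions, not taken from the textbook.

```python
import random
from statistics import mean


def simulate_heights(ages, n_per_age, seed=0):
    """Draw hypothetical heights (cm) for boys at each fixed age.

    The rule 80 + 6 * age plus normal noise is an assumption made
    only for illustration.
    """
    rng = random.Random(seed)
    return {
        age: [80 + 6 * age + rng.gauss(0, 5) for _ in range(n_per_age)]
        for age in ages
    }


def conditional_means(samples):
    """Average the dependent variable at each fixed value of X."""
    return {x: mean(ys) for x, ys in samples.items()}


samples = simulate_heights(ages=[6, 8, 10, 12], n_per_age=200)
means = conditional_means(samples)
for age, avg in means.items():
    print(f"age {age}: average height {avg:.1f} cm")
```

Individual heights are scattered at every age, but the averages follow a clear pattern across ages; regression analysis targets that pattern of averages rather than any single observation.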


Examples of Regression Analysis


1. Reconsider Galton’s law of universal
regression.

We want to find out how the average height
of sons changes, given the father’s height.

Look at the scatter diagram, or scattergram,
in Figure 1.1.


Figure 1.1 Hypothetical distribution of sons’ heights
corresponding to given heights of fathers.


Examples of Regression Analysis


2. Consider the heights of boys measured at
fixed ages.

Notice that corresponding to any given age
we have a range of heights. Therefore,
knowing the age, we may be able to predict
the average height corresponding to that age.


Figure 1.2 Hypothetical distribution of heights
corresponding to selected ages.


Examples of Regression Analysis


5. A labor economist may want to study the rate
of change of money wages in relation to the
unemployment rate.

Figure 1.3


Examples of Regression Analysis


6. From monetary economics it is known that, other things
remaining the same, the higher the rate of inflation π, the lower
the proportion k of their income that people would want to hold
in the form of money, as depicted in Figure 1.4.

A quantitative analysis of this relationship will enable the
monetary economist to predict the amount of money, as a
proportion of their income, that people would want to hold at
various rates of inflation.


Figure 1.4 Money holding in relation
to the inflation rate π


STATISTICAL AND DETERMINISTIC RELATIONSHIPS
• In regression analysis we are concerned with
what is known as the statistical, not the
functional or deterministic, dependence
among variables. Deterministic dependence is
the kind found in classical physics.
• In statistical relationships among variables we
essentially deal with random or stochastic
variables. These variables have probability
distributions.
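
The contrast can be sketched in a few lines. In the example below, the free-fall formula stands in for a deterministic relationship and a noisy crop-yield rule stands in for a statistical one. Both functions, the coefficients, and the noise level are illustrative assumptions chosen for this sketch.

```python
import random

G = 9.81  # gravitational acceleration in m/s^2


def fall_distance(seconds):
    """Deterministic: the same input always gives the same output."""
    return 0.5 * G * seconds ** 2


def crop_yield(rainfall_mm, rng):
    """Statistical: for a given input the output has a distribution.

    The linear rule and the noise level are hypothetical.
    """
    return 2.0 + 0.01 * rainfall_mm + rng.gauss(0, 0.3)


rng = random.Random(42)
deterministic = [fall_distance(2.0) for _ in range(3)]
statistical = [crop_yield(500, rng) for _ in range(3)]

print(deterministic)                        # three identical values
print([round(v, 2) for v in statistical])   # three different draws
```

Repeating the deterministic calculation never changes the answer, while repeating the statistical one at the same rainfall gives a different yield each time. Only the second kind of relationship calls for regression analysis.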


REGRESSION VERSUS CAUSATION


• Although regression analysis deals with the
dependence of one variable on other
variables, it does not necessarily imply
causation.
• A statistical relationship per se cannot
logically imply causation.


REGRESSION VERSUS CORRELATION


• In correlation analysis we try to measure
the strength or degree of linear association
between two variables. The correlation
coefficient measures this strength of (linear)
association.
• In regression analysis we try to estimate the
average value of one variable on the basis of
the fixed values of other variables.


REGRESSION VERSUS CORRELATION


• In correlation analysis we treat any two
variables symmetrically. There is no
distinction between variables. Both variables
are considered random.

• Most of the regression theory is based on the
assumption that the dependent variable is
stochastic but the explanatory variables are
fixed or nonstochastic.
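
The symmetry of correlation and the asymmetry of regression can be checked numerically. The sketch below uses made-up data and a least-squares slope, a technique the slides have not introduced yet, so the `slope` function and the numbers in `x` and `y` should be read as assumptions for illustration.

```python
from statistics import mean


def correlation(a, b):
    """Correlation coefficient between two variables, treated alike."""
    ma, mb = mean(a), mean(b)
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / (saa * sbb) ** 0.5


def slope(dependent, explanatory):
    """Least-squares slope from regressing `dependent` on `explanatory`."""
    md, me = mean(dependent), mean(explanatory)
    sde = sum((d - md) * (e - me) for d, e in zip(dependent, explanatory))
    see = sum((e - me) ** 2 for e in explanatory)
    return sde / see


# Hypothetical observations
x = [1, 2, 3, 4, 5]
y = [2.1, 2.9, 4.2, 4.8, 6.5]

print(correlation(x, y), correlation(y, x))  # same value either way
print(slope(y, x), slope(x, y))              # depends on which variable is dependent
```

Swapping the arguments leaves the correlation unchanged but changes the regression slope, which is why regression requires deciding which variable is dependent and which is explanatory.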


TERMINOLOGY
Dependent variable      Explanatory variable
Explained variable      Independent variable
Predictand              Predictor
Regressand              Regressor
Response                Stimulus
Endogenous              Exogenous
Outcome                 Covariate
Controlled variable     Control variable


TERMINOLOGY
• In a simple (two-variable) regression analysis
we study the dependence of a variable on
only a single explanatory variable, such as
that of consumption expenditure on real
income.
• In a multiple regression analysis we study the
dependence of one variable on more than
one explanatory variable, such as that of
money demand on interest rates, income, and
inflation.


TERMINOLOGY
• The term random is a synonym for the term
stochastic. A random (stochastic) variable is a
variable that can take on any set of values,
positive or negative, with a given probability.


NOTATION
• Y: dependent variable
• X1, X2, …, Xk : explanatory variables
• Xk : kth explanatory variable
• Xki : ith observation on variable Xk (cross-sectional data)
• Xkt : tth observation on variable Xk (time series data)
• N (or T, for time series data): the total number of
observations or values in the population.
• n (or t, for time series data): the total number of
observations in the sample.
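
The notation can be collected into a single expression. The block below is a sketch of how a regression with k explanatory variables is commonly written. The coefficients β and the disturbance term u are not defined on this slide and are introduced here only as assumptions for illustration.

```latex
% Cross-sectional data: observation subscript i
Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \cdots + \beta_k X_{ki} + u_i,
\qquad i = 1, 2, \ldots, n

% Time series data: observation subscript t
Y_t = \beta_0 + \beta_1 X_{1t} + \beta_2 X_{2t} + \cdots + \beta_k X_{kt} + u_t
```

The first subscript on X identifies the variable and the second identifies the observation, matching the definitions of Xki and Xkt.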


TYPES OF DATA
• There are mainly three types of data for
empirical analysis:
1. Time series data
2. Cross sectional data
3. Pooled data


Time series data


• A time series is a set of observations on the
values that a variable takes at different times.


Cross-sectional data
• Cross-sectional data are data on one or more
variables collected at the same point in time.
GPA     Study hours/week
3.5     10
2.7     8
1.9     9
2.3     5
2.0     8
2.2     6
2.5     3
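
As a small illustration of what can be done with such data, the sketch below fits a straight line of GPA on study hours using the seven observations in the table. Least squares is used as the fitting method, which is an assumption since the slides have not yet introduced an estimation technique, and `fit_line` is a hypothetical helper name.

```python
from statistics import mean

# The seven cross-sectional observations from the table
gpa = [3.5, 2.7, 1.9, 2.3, 2.0, 2.2, 2.5]
hours = [10, 8, 9, 5, 8, 6, 3]


def fit_line(y, x):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = my - b * mx
    return a, b


intercept, slope = fit_line(gpa, hours)
print(f"fitted line: GPA = {intercept:.3f} + {slope:.3f} * hours")
```

With only seven observations the fitted slope is very imprecise, so the result should not be read as evidence about the effect of study time on GPA.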


Pooled data
• In pooled data there are elements of both
time series and cross-sectional data.
Year    GPA     Study hours/week
2000    2.5     9
2000    2.7     8
2000    2.3     6
2005    1.9     5
2005    3.1     12
2010    2.4     7
2010    2.0     5
2010    3.9     11
2010    1.2     2


• Panel data is a special type of pooled data in
which the same cross-sectional unit is
surveyed over time.
Person  Year    GPA     Study hours/week
1       2010    2.5     9
1       2011    2.7     7
1       2012    2.3     6
2       2010    1.9     8
2       2011    3.1     12
2       2012    2.4     6
3       2010    2.0     5
3       2011    3.9     11
3       2012    1.2     2
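
What distinguishes a panel from other pooled data is that the same units reappear in each period. The sketch below checks this for the nine rows of the panel table. The helper names and the term "balanced" (every unit observed in every period) are additions made for this illustration and do not come from the slides.

```python
from collections import defaultdict

# (person, year, GPA, study hours per week), copied from the panel table
panel = [
    (1, 2010, 2.5, 9), (1, 2011, 2.7, 7), (1, 2012, 2.3, 6),
    (2, 2010, 1.9, 8), (2, 2011, 3.1, 12), (2, 2012, 2.4, 6),
    (3, 2010, 2.0, 5), (3, 2011, 3.9, 11), (3, 2012, 1.2, 2),
]


def years_by_person(rows):
    """Map each person to the set of years in which they were observed."""
    seen = defaultdict(set)
    for person, year, _gpa, _hours in rows:
        seen[person].add(year)
    return dict(seen)


def is_balanced_panel(rows):
    """True when every unit is observed in every period present in the data."""
    coverage = years_by_person(rows)
    all_years = set().union(*coverage.values())
    return all(years == all_years for years in coverage.values())


print(years_by_person(panel))
print(is_balanced_panel(panel))
```

Dropping any single row makes the check fail, because one person would then be missing a year that the others have.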


Sources of Data
• Government agencies (Department of
Commerce...)
• International agencies (World Bank...)
• Surveys

In the social sciences the data that one generally
obtains are nonexperimental in nature, that is, not
subject to the control of the researcher.


The quality of the data used in economics is often
not that good:
1. Possibility of observational errors.
2. Approximations and roundoffs.
3. Nonresponse to surveys may cause
selectivity bias.
4. The sampling methods used in obtaining the
data may vary so widely that it might be very
difficult to compare results across samples.


5. Economic data are generally available at a
highly aggregate level (GNP...). Such highly
aggregated data may not tell us much about the
individual or micro level units.
6. Because of confidentiality, certain data can
be published only in highly aggregate form
(health data...).

The researcher should always keep in mind
that the results of research are only as good
as the quality of the data.
