ECONOMETRICS
Structure
1.0 Objectives
1.1 Introduction
1.2 The Nature of Econometrics
1.3 Probability Distributions
    1.3.1 Discrete Probability Distribution
    1.3.2 Continuous Probability Distribution
1.4 Sampling Distribution
1.5 Statistical Inference
    1.5.1 Estimation
    1.5.2 Hypothesis Testing
1.6 Software Packages for Econometric Analysis
1.7 Let Us Sum Up
1.8 Key Words
1.9 Some Useful Books/References
1.0 OBJECTIVES
After going through this unit you will be in a position to:
explain why we should study econometrics;
appreciate the scope of econometrics; and
1.1 INTRODUCTION
e) Once estimates are obtained, the next task is testing of the hypotheses put forth in the first step above. Basically it amounts to checking the statistical significance of the estimates obtained by us.
1) p(Xi) ≥ 0 for all Xi ∈ R
Normal Distribution
Normal distribution is perhaps the most widely used distribution in Statistics and related subjects. It has found applications in inquiries concerning heights and weights of people, IQ scores, errors in measurement, rainfall studies and so on. The probability density function p(x) of a continuous random variable that follows the normal distribution is given by
p(x) = (1/(σ√(2π))) exp[−(x − μ)²/(2σ²)],   −∞ < x < ∞
where μ is the mean and σ is the standard deviation of the distribution.
a) 68.3% of the area under the normal curve lies between the ordinates at μ − σ and μ + σ. Thus in Fig. 1.2, 68.3% area is covered when x ranges between 46 and 54.
b) 95.5% of the area under the normal curve lies between the ordinates at μ − 2σ and μ + 2σ. In Fig. 1.2, 95.5% area is covered when 42 ≤ x ≤ 58.
c) 99.7% of the area (i.e., almost the whole of the distribution) under the normal curve lies between the ordinates at μ − 3σ and μ + 3σ. In Fig. 1.2 we find that 99.7% area is covered when 38 ≤ x ≤ 62.
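These three areas can be checked numerically. A minimal sketch in Python, using the standard-library error function to build the normal CDF; the values μ = 50 and σ = 4 are assumptions read off the ranges 46–54, 42–58 and 38–62 quoted above:

```python
import math

def normal_area(mu, sigma, lo, hi):
    """Area under the N(mu, sigma^2) density between lo and hi, via the error function."""
    def cdf(x):
        return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))
    return cdf(hi) - cdf(lo)

# mu = 50 and sigma = 4 are assumed from the ranges quoted in the text
within_1_sigma = normal_area(50, 4, 46, 54)  # about 0.683
within_2_sigma = normal_area(50, 4, 42, 58)  # about 0.955
within_3_sigma = normal_area(50, 4, 38, 62)  # about 0.997
```

Evaluating these areas reproduces the familiar 68–95–99.7 rule for the normal curve.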
Fig. 1.3: Student's-t Probability Curves (t with n = 10)
1) As we can see in Fig. 1.3, like the normal distribution, the Student's-t distribution is also symmetric and its range of variation is also from −∞ to +∞; however, it is flatter than the normal distribution. We should note that as the degrees of freedom increase, the Student's-t distribution approaches the normal distribution.
Another continuous probability distribution that finds use in econometrics is the F distribution. If z1 and z2 are two chi-square variables that are independently distributed with k1 and k2 degrees of freedom respectively, then the variable
F = (z1/k1) / (z2/k2)
follows the F distribution with k1 and k2 degrees of freedom respectively. The variable is denoted by F(k1, k2), where the subscripts k1 and k2 are the degrees of freedom associated with the chi-square variables.
We may note here that k1 is called the numerator degrees of freedom and in the same way, k2 is called the denominator degrees of freedom.
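The definition above can be checked by simulation: build two independent chi-square variables as sums of squared standard normal draws and take the scaled ratio. A sketch; the degrees of freedom k1 = 5 and k2 = 20 are illustrative choices, not values from the text:

```python
import random

def chi_square(k):
    # sum of squares of k independent standard normal draws
    return sum(random.gauss(0, 1) ** 2 for _ in range(k))

def f_variate(k1, k2):
    # F = (z1/k1) / (z2/k2), with z1 and z2 independent chi-square variables
    return (chi_square(k1) / k1) / (chi_square(k2) / k2)

random.seed(0)
draws = [f_variate(5, 20) for _ in range(20000)]
mean = sum(draws) / len(draws)  # should be near k2/(k2 - 2) = 20/18 for k2 > 2
```

The simulated mean settling near k2/(k2 − 2) is a standard property of the F distribution and serves as a sanity check on the construction.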
The critical value for the F distribution is given in the F Table at the end of the Block.
samples from a given population and each sample provides us with a sample mean. As the sample mean (x̄) assumes different values and for each value we can attach certain probability, it is considered as a random variable.
Now let us consider another important concept: the central limit theorem. It states that, whatever be the distribution of the parent population, the sampling distribution of the sample mean approaches the normal distribution as the sample size becomes large.
Usually we consider a sample to be large in size if n > 30.
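The central limit theorem can be illustrated by simulation: draw repeated samples of size n = 30 from a decidedly non-normal population and look at the distribution of their means. A sketch; the uniform(0, 1) population and the number of repetitions are arbitrary choices for illustration:

```python
import random
import statistics

random.seed(2)

def sample_mean(n):
    # mean of n draws from a uniform(0, 1) population (mu = 0.5, sigma^2 = 1/12)
    return statistics.fmean(random.random() for _ in range(n))

means = [sample_mean(30) for _ in range(5000)]
center = statistics.fmean(means)  # close to the population mean 0.5
spread = statistics.stdev(means)  # close to sigma/sqrt(n) = sqrt(1/12)/sqrt(30), about 0.053
```

Even though the parent population is uniform, the histogram of the 5000 sample means is approximately normal, centred at the population mean with standard deviation σ/√n.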
θ̂ = θ with probability (n − 1)/n
θ̂ = n with probability 1/n
In this case E(θ̂) = θ(n − 1)/n + 1 = θ + 1 − θ/n, so that lim E(θ̂) = θ + 1 ≠ θ as n → ∞.
We observe that an unbiased estimator is asymptotically unbiased but the reverse is not essentially true. For example, suppose we modify our previous example as
θ̂ = θ with probability 1 − 1/n²
θ̂ = n with probability 1/n²
In this case
bias = E(θ̂) − θ = θ(1 − 1/n²) + n(1/n²) − θ = 1/n − θ/n² ≠ 0
Thus the above statistic is not unbiased. In the limiting case, as n → ∞, however, lim E(θ̂) = θ. Thus it is asymptotically unbiased.
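The limiting behaviour of the bias can be verified directly from the expectation formula. A short sketch, assuming (as reconstructed above) that the modified estimator takes the value θ with probability 1 − 1/n² and n with probability 1/n²; θ = 5 is an arbitrary illustrative value:

```python
def expectation(theta, n):
    # E(theta_hat) = theta*(1 - 1/n^2) + n*(1/n^2) = theta + 1/n - theta/n^2
    return theta * (1 - 1 / n**2) + n / n**2

# bias = 1/n - theta/n^2 shrinks toward zero as n grows
biases = {n: expectation(5, n) - 5 for n in (2, 10, 100, 1000)}
```

The bias is non-zero for every finite n but vanishes in the limit, which is exactly what asymptotic unbiasedness requires.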
Interval Estimation
The point estimate may not be realistic in the sense that the parameter value may not exactly be equal to it. An alternative procedure is to give an interval, which would hold the parameter with certain probability. Here we specify a lower limit and an upper limit within which the parameter value is likely to remain. Also we specify the probability of the parameter remaining in the interval. We call the interval the 'confidence interval' and the probability of the parameter remaining within this interval as 'confidence level' or 'confidence coefficient'.
The standard deviation of the sampling distribution of the sample mean is σ/√n, where n is the size of the sample. By transforming the sampling distribution (z = (x̄ − μ)/(σ/√n)) we obtain a standard normal variate, which has zero mean and unit variance. The standard normal curve is symmetrical and therefore the area under the curve for 0 ≤ z < ∞ is 0.5, as we can see from Table A1 given at the end of the Block.
If we want our confidence coefficient to be 95 per cent (that is, 0.95), we find out a range for z which will cover 0.95 area of the standard normal curve. Since the distribution of z is symmetrical, 0.475 area should remain to the right and 0.475 area should remain to the left of z = 0. From Table A1 we find that 0.475 area is covered when z = 1.96. Thus the probability that z ranges between −1.96 and 1.96 is 0.95. From this information let us work backward and find the range within which μ will remain.
We find that
P(x̄ − 1.96 σ/√n ≤ μ ≤ x̄ + 1.96 σ/√n) = 0.95    ...(1.7)
Similarly, for a confidence coefficient of 0.99 we obtain
P(x̄ − 2.58 σ/√n ≤ μ ≤ x̄ + 2.58 σ/√n) = 0.99    ...(1.8)
Equation (1.8) implies that the 99 per cent confidence interval for μ is given by x̄ ± 2.58 σ/√n.
By looking into the normal area table you can work out the confidence interval for a confidence coefficient of 0.90 and find that the 90 per cent confidence interval for μ is x̄ ± 1.645 σ/√n    ...(1.9)
We observe from (1.7), (1.8) and (1.9) that as the interval widens, the probability of the interval holding the population parameter (in this case μ) increases.
The two limits of the confidence interval are called confidence limits. For example, for 95 per cent confidence level we have the lower confidence limit as x̄ − 1.96 σ/√n and the upper confidence limit as x̄ + 1.96 σ/√n. The confidence coefficient can be interpreted as the confidence or trust that we place in these limits for actually holding μ.
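The confidence limits can be computed mechanically from the formula x̄ ± z σ/√n. A minimal sketch; the sample values x̄ = 50, σ = 4 and n = 25 are assumed for illustration:

```python
import math

def confidence_interval(xbar, sigma, n, z=1.96):
    # xbar +/- z * sigma / sqrt(n); z = 1.96 gives the 95 per cent limits
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

lower, upper = confidence_interval(xbar=50, sigma=4, n=25)        # 95 per cent limits
wider = confidence_interval(xbar=50, sigma=4, n=25, z=2.58)       # 99 per cent: a wider interval
```

Comparing the two calls also shows the point made above: raising the confidence coefficient from 0.95 to 0.99 widens the interval.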
There is a possibility that the null hypothesis that we intend to test is not true and female literacy in Orissa is not equal to 51 per cent. Thus there is a need for an alternative hypothesis which holds true in case the null hypothesis is not true. We denote the alternative hypothesis by the symbol H1 and formulate it as follows:
We have to keep in mind that null hypothesis and alternative hypothesis are mutually exclusive, that is, both cannot be true simultaneously. Secondly, both H0 and H1 exhaust all possible options regarding the parameter, that is, there cannot be a third possibility. For example, in the case of female literacy in the village, there are two possibilities: the literacy rate is 51 per cent or it is not 51 per cent; a third possibility is not there.
It is a rare coincidence that the sample mean (x̄) is equal to the population mean (μ). In most cases we find a difference between x̄ and μ. Is the difference because of sampling fluctuation or is there a genuine difference between the sample and the population? In order to answer this question we need a test statistic to test the difference between the two. The result that we obtain by using the test statistic needs to be interpreted and a decision needs to be taken regarding whether the null hypothesis be rejected or not.
Rejection Region
While discussing confidence interval we mentioned that if confidence level is 95 per cent then 5 per cent area of the standard normal curve remains under the rejection region. Let us look into the standard normal curve presented in Fig. 1.5, where the x-axis represents the variable z and the y-axis represents the probability of z, that is p(z). If the estimate falls under the rejection region then the null hypothesis is rejected. Otherwise, the hypothesis is not rejected.
If the absolute value of z is less than the critical value we should not reject the null hypothesis.
If the absolute value of z exceeds the critical value we should reject the null hypothesis and accept the alternative hypothesis.
Thus in the case of large samples the absolute value of z can be considered as the test statistic for hypothesis testing, where z = (x̄ − μ)/(σ/√n).
When we have a significance level of 5 per cent, the area covered under the standard normal curve is 95 per cent. Thus 95 per cent area under the curve is bounded by −1.96 ≤ z ≤ 1.96. The remaining 5 per cent area is covered by z ≤ −1.96 and z ≥ 1.96. Thus 2.5 per cent of area on each side of the standard normal curve constitutes the rejection region.
For small samples (n ≤ 30), if population standard deviation is known we apply the z-statistic for hypothesis testing. On the other hand, if population standard deviation is not known we apply the t-statistic.
In the case of small samples, if population standard deviation is known the test statistic is
z = (x̄ − μ)/(σ/√n)
On the other hand, if population standard deviation is not known the test statistic is
t = (x̄ − μ)/(s/√n)
where s is the sample standard deviation.
In the case of the t-distribution, however, the area under the curve (which implies probability) changes according to degrees of freedom. Thus while finding the critical value of t we should take into account the degrees of freedom. When sample size is n, degrees of freedom is n − 1. Thus we should remember two things while finding the critical value of t. These are: i) significance level, and ii) degrees of freedom.
One-tail and Two-tail Tests
In Fig. 1.4 we have shown the rejection region on both sides of the standard
normal curve. However, in many cases we may place the rejection region on
one side (either left or right) of the standard normal curve.
If it is a two-tail test with significance level α, then α/2 area is placed on each side of the standard normal curve. But if it is a one-tail test then the entire α area is placed on one side of the standard normal curve. Thus the critical values for one-tail and two-tail tests differ.
The selection of a one-tail or two-tail test depends upon the formulation of the alternative hypothesis. When the alternative hypothesis is of the type H1: x̄ ≠ μ we have a two-tail test, because x̄ could be either greater than or less than μ. On the other hand, if the alternative hypothesis is of the type H1: x̄ < μ, then the entire rejection region is on the left hand side of the standard normal curve. Similarly, if the alternative hypothesis is of the type H1: x̄ > μ, then the entire rejection region is on the right hand side of the standard normal curve.
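The one-tail and two-tail decision rules can be collected into a single routine. A sketch using the standard library's NormalDist for the critical values; the sample numbers in the usage lines are invented for illustration:

```python
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha=0.05, tail="two"):
    # large-sample test statistic: z = (xbar - mu0) / (sigma / sqrt(n))
    z = (xbar - mu0) * n ** 0.5 / sigma
    nd = NormalDist()
    if tail == "two":   # H1: mean != mu0; alpha/2 in each tail (critical value 1.96 at 5%)
        return abs(z) > nd.inv_cdf(1 - alpha / 2)
    if tail == "left":  # H1: mean < mu0; entire alpha in the left tail (critical value -1.645)
        return z < nd.inv_cdf(alpha)
    return z > nd.inv_cdf(1 - alpha)  # "right": H1: mean > mu0

reject_two = z_test(51.5, 51, sigma=2, n=100)                # z = 2.5 > 1.96, so reject
reject_left = z_test(50.5, 51, sigma=2, n=100, tail="left")  # z = -2.5 < -1.645, so reject
```

Note how the same z value can lead to different decisions under one-tail and two-tail rules, since the critical values differ (1.96 versus 1.645 at the 5 per cent level).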