Fundamentals of Statistics

Uploaded by

PrashansaBhatia
Copyright
© © All Rights Reserved

16-6 FUNDAMENTALS OF STATISTICS

For example, if we want to estimate the parameter P of the binomial distribution (B.D.) when n is known, then we equate the mean of the B.D., which is nP, to the sample mean $\bar{x}$, thus giving

$$nP = \bar{x} \;\Rightarrow\; \hat{P} = \bar{x}/n \tag{16·17}$$

If both n and P are unknown, then we take the first two moments. Thus solving

$$\text{Mean} = nP = \bar{x} \quad\text{and}\quad \text{Variance} = nPQ = nP(1-P) = s^2$$

for P and n (since the sample mean $\bar{x}$ and sample variance $s^2$ are known from the sample values), we get the corresponding estimates of the population parameters.
This technique is commonly used if we have to estimate the theoretical frequencies of a distribution by fitting an appropriate probability distribution to it.
In the above discussion, the reader is exposed to basic concepts and notions in point estimation. A detailed study of these topics is, however, beyond the scope of the book.
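The two moment equations for the binomial case can be solved in closed form: dividing the variance equation by the mean equation gives Q = 1 - P, after which n follows from nP. The sketch below illustrates this. The function name and the sample moments are hypothetical, chosen only for illustration, and the estimate of n is not rounded to an integer.

```python
def binomial_moment_estimates(sample_mean, sample_variance):
    """Method-of-moments estimates (n_hat, p_hat) for a binomial distribution.

    Solves  n*P = mean  and  n*P*(1 - P) = variance  for n and P.
    """
    if sample_mean <= 0 or not 0 < sample_variance < sample_mean:
        raise ValueError("need 0 < variance < mean for a binomial fit")
    p_hat = 1 - sample_variance / sample_mean   # (nPQ)/(nP) = Q = 1 - P
    n_hat = sample_mean / p_hat                 # from nP = mean
    return n_hat, p_hat


# Hypothetical sample moments, chosen only for illustration.
n_hat, p_hat = binomial_moment_estimates(sample_mean=4.0, sample_variance=2.4)
print(n_hat, p_hat)
```

In practice n must be a whole number, so the estimate of n would be rounded before computing the fitted frequencies.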
16·3. SAMPLING DISTRIBUTION OF A STATISTIC
As already stated, the sampling distribution of a statistic plays a very important role in statistical inference, both in estimation theory and in testing of hypothesis. In Chapter 15 [§ 15·4·1] we gave the concept of sampling distribution of a statistic. We will recapitulate the ideas briefly and study its use in detail.
If we take a sample of size n from a population of size N, then there are $^{N}C_{n} = k$ (say) possible samples. We can compute the statistic, say, t for each of these samples. Let $t_1, t_2, \ldots, t_k$ be the values of the statistic for these k possible samples. Thus, the statistic t may be regarded as a random variable which can take any one of the values $t_1, t_2, \ldots, t_k$. The set of the values of t constitutes what is known as the sampling distribution of the statistic t.
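We can compute the mean, variance and other statistical constants of the sampling distribution of the statistic t. For example,

$$\text{Mean} = E(t) = \frac{1}{k}\sum_{i=1}^{k} t_i = \bar{t} \ \text{(say)} \tag{16·18}$$

and
$$\text{Var}(t) = E[t - E(t)]^2 = \frac{1}{k}\sum_{i=1}^{k} (t_i - \bar{t})^2 \tag{16·19}$$

The standard deviation of the sampling distribution of a statistic is known as its standard error (S.E.).

$$\text{S.E.}(t) = \sqrt{\text{Var}(t)} = \left[\frac{1}{k}\sum_{i=1}^{k} (t_i - \bar{t})^2\right]^{1/2} \tag{16·20}$$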
The sampling distribution of a statistic and its standard error are extensively used in Statistics:
(i) To determine the precision of the sample estimate of some population parameter, which is given by the reciprocal of the S.E. of the sampling distribution of the estimate. Thus, if t is a statistic used to estimate the parameter θ, then

$$\text{Precision of } t = \frac{1}{\text{S.E.}(t)} \tag{16·21}$$

(ii) To test if the sample statistic differs significantly from the corresponding hypothetical value in the population, i.e., to test the significance of the difference (t − θ). By this we mean: Can the difference between the sample estimate and the corresponding population parameter be attributed to fluctuations of sampling?
(iii) To test the significance of the difference between two independent sample estimates of the same population parameter.
(iv) To obtain point estimates of the population parameters.
THEORY OF ESTIMATION AND TESTING OF HYPOTHESIS 16-7
(v) To obtain interval estimates of the population parameters, i.e., to obtain probable limits between which the true value of the parameter may be expected to lie. These limits are provided by

$$t \pm t_{\alpha} \times \text{S.E.}(t) \tag{16·22}$$

where $t_{\alpha}$ is the significant or critical value of t at level of significance α for a two-tailed test. [For details see § 16·6·9].
16·3·1. Sampling Distribution of Mean. We have already discussed in § 15·12 that if $x_1, x_2, \ldots, x_n$ is a random sample (without replacement) of size n from a finite population of size N, then

$$\text{Var}(\bar{x}) = \frac{N-n}{N-1}\cdot\frac{\sigma^2}{n} \tag{16·23}$$

If N → ∞, i.e., if we take samples from an infinite (very large) population so that the sampling fraction n/N can be neglected, then rewriting (16·23), we get

$$\text{Var}(\bar{x}) \simeq \frac{N-n}{N}\cdot\frac{\sigma^2}{n} = \left(1-\frac{n}{N}\right)\frac{\sigma^2}{n} \qquad [\because\ N-1 \simeq N \text{ if } N \text{ is large}]$$

$$\Rightarrow\quad \text{Var}(\bar{x}) = \frac{\sigma^2}{n} \tag{16·24}$$

neglecting the sampling fraction n/N, which will be very small as N → ∞. The result (16·24) also holds if we draw a random sample with replacement from a finite population, because sampling with replacement amounts to drawing samples from an infinite population.
Sampling distribution of mean possesses the following properties:
(i) Sample mean ($\bar{x}$) is an unbiased estimate of the population mean μ, i.e.,
$E(\bar{x}) = \mu$ ...(*)
i.e., the mean of all possible values of sample means is equal to the population mean. This does not depend on the sample size n.
(ii) The variance of $\bar{x}$ depends on the sample size (n) and is equal to σ²/n. In other words, Var($\bar{x}$) is inversely proportional to n. Therefore, the larger the sample size, the smaller is S.E.($\bar{x}$) and consequently, the more efficient is the estimator $\bar{x}$.

In addition to the above properties, the sampling distribution of $\bar{x}$ has a very important asymptotic property, i.e., a property that holds as n → ∞. This is embodied in the Central Limit Theorem given below.
16·3·2. Central Limit Theorem. We know that if $x_1, x_2, \ldots, x_n$ is a random sample from a normal population with mean μ and variance σ², then the sample mean $\bar{x}$ is also normally distributed with mean μ and variance σ²/n, i.e., $\bar{x} \sim N(\mu, \sigma^2/n)$. This result is true even if the population from which the samples are drawn is not normal, provided the sample size is sufficiently large, as stated in the following Central Limit Theorem, which is one of the most fundamental theorems in Statistics.
"If $x_1, x_2, \ldots, x_n$ is a random sample of size n from any population (with mean μ and finite variance σ²), then the sample mean ($\bar{x}$) is normally distributed with mean μ and variance σ²/n provided n is sufficiently large." In other words, $\bar{x} \sim N(\mu, \sigma^2/n)$ asymptotically, i.e., as n → ∞. The larger the sample size, the better will be the approximation. From the practical point of view, the approximation is fairly good if n > 30.
Remark. In fact, the above result is simply a deduction from the more generalised central limit theorem which is stated below:
Central Limit Theorem. "If $X_1, X_2, \ldots, X_n$ are independent random variables following any distribution, then under certain very general conditions, their sum $X = X_1 + X_2 + \cdots + X_n$ is asymptotically normally distributed, i.e., X follows normal distribution as n → ∞."
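The theorem can be checked empirically with a small simulation. The following is a minimal sketch, not part of the original text: the population (the faces of a fair die), the sample size n = 36, the number of repetitions and the seed are all assumptions made for illustration. It compares the mean and variance of the simulated sample means with μ and σ²/n, and counts how many sample means lie within 1.96 standard errors of μ.

```python
import random
import statistics


def simulate_sample_means(population, n, num_samples, seed=0):
    """Draw num_samples samples of size n (with replacement); return their means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(population, k=n))
            for _ in range(num_samples)]


# Hypothetical non-normal population: the faces of a fair die.
population = [1, 2, 3, 4, 5, 6]
mu = statistics.fmean(population)           # population mean
sigma2 = statistics.pvariance(population)   # population variance

n = 36
means = simulate_sample_means(population, n, num_samples=5000)

print("mean of sample means:", statistics.fmean(means), "vs mu =", mu)
print("variance of sample means:", statistics.pvariance(means),
      "vs sigma^2/n =", sigma2 / n)

# If the sample mean is roughly normal, about 95% of the simulated means
# should lie within 1.96 standard errors of mu.
se = (sigma2 / n) ** 0.5
inside = sum(abs(m - mu) <= 1.96 * se for m in means) / len(means)
print("share within 1.96 S.E.:", inside)
```

Because the results are simulated, the printed figures vary with the seed and only approximate the theoretical values.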
By using this theorem, it has been proved that the sampling distributions of most of the statistics, like sample proportion (p), difference of sample proportions ($p_1 - p_2$), difference of sample means ($\bar{x}_1 - \bar{x}_2$), difference of sample standard deviations ($s_1 - s_2$), etc., are asymptotically normal, i.e., the standardised variate corresponding to any one of these statistics is N(0, 1) for large samples. Thus, if t is any statistic, then by the central limit theorem,

$$Z = \frac{t - E(t)}{\text{S.E.}(t)} \sim N(0, 1) \tag{16·25}$$

asymptotically, i.e., as n → ∞. This result, viz., that the distribution of any sample statistic in its standardised form is asymptotically normal as n → ∞, is extensively used in Large Sample Tests [Chapter 17] and also in constructing confidence limits for the population parameters when samples are large.
Example 16·2. A population consists of five numbers (2, 3, 6, 8, 11). Consider all possible samples of size two which can be drawn with replacement from this population. Calculate the standard error of sample mean.
Solution. In random sampling with replacement, any one of the five numbers 2, 3, 6, 8, 11 drawn in the first draw can be associated with any one of these five numbers drawn in the second draw, and hence the total number of possible samples of size 2 is 5 × 5 = 25 and is given by the cross product (2, 3, 6, 8, 11) × (2, 3, 6, 8, 11), as given below.

(2, 2)    (2, 3)    (2, 6)    (2, 8)    (2, 11)
(3, 2)    (3, 3)    (3, 6)    (3, 8)    (3, 11)
(6, 2)    (6, 3)    (6, 6)    (6, 8)    (6, 11)
(8, 2)    (8, 3)    (8, 6)    (8, 8)    (8, 11)
(11, 2)   (11, 3)   (11, 6)   (11, 8)   (11, 11)

The corresponding sample means are:

                                        Total
 2.0    2.5    4.0    5.0    6.5        20.0
 2.5    3.0    4.5    5.5    7.0        22.5
 4.0    4.5    6.0    7.0    8.5        30.0      ...(i)
 5.0    5.5    7.0    8.0    9.5        35.0
 6.5    7.0    8.5    9.5   11.0        42.5
                             Total     150.0
The mean of the sampling distribution of sample mean is

$$E(\bar{x}) = \frac{1}{25}\sum \bar{x} = \frac{150}{25} = 6 \tag{ii}$$

The variance of the sampling distribution of the mean is given by

$$\text{Var}(\bar{x}) = \frac{1}{25}\sum [\bar{x} - E(\bar{x})]^2 = \frac{1}{25}\left[(2-6)^2 + (2.5-6)^2 + (4-6)^2 + \cdots + (9.5-6)^2 + (11-6)^2\right]$$

$$= \frac{1}{25}\,[\,16 + 12.25 + 4 + 1 + 0.25 + 12.25 + 9 + 2.25 + 0.25 + 1 + 4 + 2.25 + 0 + 1 + 6.25$$
$$\qquad + 1 + 0.25 + 1 + 4 + 12.25 + 0.25 + 1 + 6.25 + 12.25 + 25\,] = \frac{135}{25} = 5.40$$

$$\text{S.E.}(\bar{x}) = \sqrt{\text{Var}(\bar{x})} = \sqrt{5.40} = 2.32$$

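Aliter. We know that the S.E. of sample mean in random sampling from an infinite population or in sampling with replacement is given by

$$\text{Var}(\bar{x}) = \frac{\sigma^2}{n} \;\Rightarrow\; \text{S.E.}(\bar{x}) = \frac{\sigma}{\sqrt{n}}, \text{ where } \sigma \text{ is the population s.d.}$$

The population values are 2, 3, 6, 8, 11.

$$\mu = \tfrac{1}{5}(2 + 3 + 6 + 8 + 11) = \tfrac{30}{5} = 6 \tag{iii}$$

$$\sigma^2 = \tfrac{1}{5}\left[(2-6)^2 + (3-6)^2 + (6-6)^2 + (8-6)^2 + (11-6)^2\right] = \tfrac{1}{5}\,[16 + 9 + 0 + 4 + 25] = \tfrac{54}{5} \tag{iv}$$

$$\text{Var}(\bar{x}) = \frac{\sigma^2}{n} = \frac{54}{5 \times 2} = \frac{54}{10} = 5.4 \;\Rightarrow\; \text{S.E.}(\bar{x}) = \sqrt{5.4} = 2.32$$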
Remark 1. It may be observed from (ii) and (iii) that
$E(\bar{x}) = 6 = \mu$
⇒ Sample mean is an unbiased estimate of the population mean.
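The enumeration in Example 16·2 can be reproduced mechanically. The sketch below (an illustration, not the book's own method) lists all 25 ordered samples with `itertools.product`, computes the mean and variance of the 25 sample means with divisor k = 25, and cross-checks the variance against σ²/n.

```python
from itertools import product
from math import sqrt
from statistics import fmean, pvariance

population = [2, 3, 6, 8, 11]
n = 2

# All 5 x 5 = 25 ordered samples drawn with replacement.
sample_means = [fmean(sample) for sample in product(population, repeat=n)]

mean_of_means = fmean(sample_means)       # E(x-bar)
var_of_means = pvariance(sample_means)    # divisor k = 25, as in the text
standard_error = sqrt(var_of_means)

print(len(sample_means), mean_of_means, var_of_means, round(standard_error, 2))

# Cross-check against the formula sigma^2 / n.
print(pvariance(population) / n)
```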
Remark 2. Sampling Without Replacement. If we have to draw samples without replacement, then we shall exclude those samples where the values are duplicated, viz., the samples (2, 2), (3, 3), (6, 6), (8, 8) and (11, 11), and in this case the variance of the sampling distribution of $\bar{x}$ will be given by [c.f. (16·23)]:

$$\text{Var}(\bar{x}) = \frac{N-n}{N-1}\cdot\frac{\sigma^2}{n}$$
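As a check on the without-replacement case, the following sketch enumerates the 20 ordered samples of size 2 that remain once the duplicated pairs are excluded, using the population 2, 3, 6, 8, 11 of Example 16·2, and compares the variance of their means with the finite-population formula. It is an illustrative verification under those assumptions only.

```python
from itertools import permutations
from statistics import fmean, pvariance

population = [2, 3, 6, 8, 11]
N, n = len(population), 2

# Ordered samples without replacement: the 25 pairs minus the 5 duplicated ones.
sample_means = [fmean(s) for s in permutations(population, n)]
enumerated_var = pvariance(sample_means)

# Finite-population formula: (N - n)/(N - 1) * sigma^2 / n.
sigma2 = pvariance(population)
formula_var = (N - n) / (N - 1) * sigma2 / n

print(len(sample_means), enumerated_var, formula_var)
```

Both routes give the same variance, which is smaller than the value σ²/n obtained when sampling with replacement.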

Example 16·3. A population consists of 4 members 0, 4, 6 and 6. Draw all possible samples of size 2 drawn with replacement. Find the sampling distribution of sample mean. Hence find the mean and variance of the sample mean. [I.C.W.A. (Intermediate), Dec. 1999]
Solution. Population values are given to be: 0, 4, 6 and 6.
Since the samples of size 2 are drawn with replacement, there are 4 × 4 = 16 samples, which are given by the cross product {0, 4, 6, 6} × {0, 4, 6, 6} and are enumerated in the following table.

Sr. No.   Sample values   Sample total   Sample mean
   1          0, 0              0             0
   2          0, 4              4             2
   3          0, 6              6             3
   4          0, 6              6             3
   5          4, 0              4             2
   6          4, 4              8             4
   7          4, 6             10             5
   8          4, 6             10             5
   9          6, 0              6             3
  10          6, 4             10             5
  11          6, 6             12             6
  12          6, 6             12             6
  13          6, 0              6             3
  14          6, 4             10             5
  15          6, 6             12             6
  16          6, 6             12             6

The sampling distribution of sample mean ($\bar{x}$) is given below.

Value of sample mean ($\bar{x}$)      0      2      3      4      5      6     Total
Frequency (f)                         1      2      4      1      4      4      16
Probability p($\bar{x}$) = f/16     1/16   2/16   4/16   1/16   4/16   4/16      1

$$E(\bar{x}) = \sum \bar{x}\,p(\bar{x}) = 0 + 2\times\tfrac{2}{16} + 3\times\tfrac{4}{16} + 4\times\tfrac{1}{16} + 5\times\tfrac{4}{16} + 6\times\tfrac{4}{16} = \tfrac{1}{16}(4 + 12 + 4 + 20 + 24) = \tfrac{64}{16} = 4$$

$$E(\bar{x}^2) = \sum \bar{x}^2\,p(\bar{x}) = 0 + 4\times\tfrac{2}{16} + 9\times\tfrac{4}{16} + 16\times\tfrac{1}{16} + 25\times\tfrac{4}{16} + 36\times\tfrac{4}{16} = \tfrac{1}{16}(8 + 36 + 16 + 100 + 144) = \tfrac{304}{16} = 19$$

$$\text{Var}(\bar{x}) = E(\bar{x}^2) - [E(\bar{x})]^2 = 19 - 4^2 = 19 - 16 = 3$$
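The sampling distribution in Example 16·3 can be tabulated programmatically. The sketch below is an illustration using exact fractions: it counts how often each sample mean occurs among the 16 equally likely ordered samples and then evaluates the mean and variance of the sample mean from that distribution.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [0, 4, 6, 6]
n = 2

# Tally the sample mean over all 4 x 4 = 16 equally likely ordered samples.
samples = list(product(population, repeat=n))
counts = Counter(Fraction(sum(s), n) for s in samples)
distribution = {mean: Fraction(freq, len(samples))
                for mean, freq in sorted(counts.items())}

expected = sum(mean * p for mean, p in distribution.items())
expected_sq = sum(mean ** 2 * p for mean, p in distribution.items())
variance = expected_sq - expected ** 2

for mean, p in distribution.items():
    print(mean, p)          # probabilities print in lowest terms, e.g. 1/8 for 2/16
print("E =", expected, "Var =", variance)
```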
Example 16·4. The marks scored by five students in a test of Statistics carrying 100 marks are 50, 60, 50, 60 and 40. If a simple random sample of size 4 is drawn without replacement, draw up the sampling distribution of sample mean. Hence, find the standard error of the sample mean.
[I.C.W.A. (Intermediate), Dec. 1997]
Solution. From a population of 5 units, a simple random sample without replacement of size 4 can be drawn in $^{5}C_{4} = {}^{5}C_{1} = 5$ ways. Hence, the total number of possible samples is 5, as given in the following table.
[The samples of size 4 are obtained on picking 4 observations out of the 5 given observations 50, 60, 50, 60 and 40, starting with the first observation, viz., 50, and then moving in a cyclic order.]

Sample No.   Sample values      Total   Sample mean
   (1)           (2)             (3)    (4) = (3) ÷ 4
    1        50, 60, 50, 60      220       55
    2        60, 50, 60, 40      210       52.5
    3        50, 60, 40, 50      200       50
    4        60, 40, 50, 60      210       52.5
    5        40, 50, 60, 50      200       50

Sampling Distribution of Mean. In the above table, we observe that out of the five values of sample mean ($\bar{x}$), each of the values 50 and 52.5 occurs twice. Thus, the sampling distribution of mean ($\bar{x}$) is as given below:

$\bar{x}$          50            52.5           55
p($\bar{x}$)    2/5 = 0.4     2/5 = 0.4     1/5 = 0.2

$$E(\bar{x}) = \sum \bar{x}\,p(\bar{x}) = 50 \times 0.4 + 52.5 \times 0.4 + 55 \times 0.2 = 20 + 21 + 11 = 52$$

$$E(\bar{x}^2) = \sum \bar{x}^2\,p(\bar{x}) = 50^2 \times 0.4 + (52.5)^2 \times 0.4 + 55^2 \times 0.2 = 1000 + 1102.5 + 605 = 2707.5$$

$$\therefore\quad \text{Var}(\bar{x}) = E(\bar{x}^2) - [E(\bar{x})]^2 = 2707.5 - 52^2 = 2707.5 - 2704 = 3.5$$

$$\Rightarrow\quad \text{S.E.}(\bar{x}) = \sqrt{\text{Var}(\bar{x})} = \sqrt{3.5} = 1.8708$$
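The five samples of Example 16·4 and the resulting standard error can be checked with a short enumeration. This is a sketch for verification only; it relies on `itertools.combinations` treating equal marks held by different students as distinct units, which matches sampling units rather than values.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, pvariance

marks = [50, 60, 50, 60, 40]
n = 4

# combinations() works on positions, so the two 50s (and the two 60s)
# count as different students and all 5C4 = 5 samples are produced.
sample_means = [fmean(s) for s in combinations(marks, n)]

mean_of_means = fmean(sample_means)
standard_error = sqrt(pvariance(sample_means))   # divisor 5, all samples equally likely
print(sorted(sample_means), mean_of_means, round(standard_error, 4))
```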
Example 16·5. Consider a hypothetical population comprising only three values: 2, 5 and 8. Draw all possible samples of size 2 and calculate the mean $\bar{x}$ and variance $s^2 = \tfrac{1}{2}\left[(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2\right]$ for each sample. Examine whether the two statistics are unbiased for the corresponding parameters. What is the sampling variance of (i) $\bar{x}$ and (ii) $s^2$?
Solution. The population consists of three values (2, 5, 8). As in Example 16·2, the total number of samples of size 2 is 3 × 3 = 9 and is given by the cross product (2, 5, 8) × (2, 5, 8). Nine sample pairs are given below:
(2, 2), (2, 5), (2, 8), (5, 2), (5, 5), (5, 8), (8, 2), (8, 5), (8, 8).
Population parameters. Population mean μ and variance σ² are given by

$$\mu = \frac{2 + 5 + 8}{3} = 5 \quad ...(*) \qquad\text{and}\qquad \sigma^2 = \tfrac{1}{3}\left[(2-5)^2 + (5-5)^2 + (8-5)^2\right] = \tfrac{18}{3} = 6 \quad ...(**)$$
SAMPLING DISTRIBUTION OF MEAN AND VARIANCE

S. No.   Sample values   Mean ($\bar{x}$)      $s^2 = \tfrac{1}{2}(x_1^2 + x_2^2) - \bar{x}^2$    $[\bar{x} - E(\bar{x})]^2$    $[s^2 - E(s^2)]^2$
  1.        (2, 2)       ½(2 + 2) = 2      ½(2² + 2²) − 2² = 0           (2 − 5)² = 9.00          (−3)² = 9.00
  2.        (2, 5)       ½(2 + 5) = 3.5    ½(2² + 5²) − 3.5² = 2.25      (3.5 − 5)² = 2.25        (−0.75)² = 0.5625
  3.        (2, 8)       ½(2 + 8) = 5      ½(2² + 8²) − 5² = 9.00        (5 − 5)² = 0             (6)² = 36.00
  4.        (5, 2)       ½(5 + 2) = 3.5    ½(5² + 2²) − 3.5² = 2.25      (3.5 − 5)² = 2.25        (−0.75)² = 0.5625
  5.        (5, 5)       ½(5 + 5) = 5      ½(5² + 5²) − 5² = 0           (5 − 5)² = 0             (−3)² = 9.00
  6.        (5, 8)       ½(5 + 8) = 6.5    ½(5² + 8²) − 6.5² = 2.25      (6.5 − 5)² = 2.25        (−0.75)² = 0.5625
  7.        (8, 2)       ½(8 + 2) = 5      ½(8² + 2²) − 5² = 9.00        (5 − 5)² = 0             (6)² = 36.00
  8.        (8, 5)       ½(8 + 5) = 6.5    ½(8² + 5²) − 6.5² = 2.25      (6.5 − 5)² = 2.25        (−0.75)² = 0.5625
  9.        (8, 8)       ½(8 + 8) = 8      ½(8² + 8²) − 8² = 0           (8 − 5)² = 9.00          (−3)² = 9.00
 Total                   Σ$\bar{x}$ = 45           Σs² = 27                      27                      101.25

$$E(\bar{x}) = \tfrac{1}{9}\sum \bar{x} = \tfrac{45}{9} = 5 = \mu \qquad [\text{From } (*)]$$

⇒ Sample mean is an unbiased estimate of the population mean.

$$E(s^2) = \tfrac{1}{9}\sum s^2 = \tfrac{27}{9} = 3 \ne \sigma^2 \quad (\because\ \sigma^2 = 6) \qquad [\text{From } (**)]$$

⇒ Sample variance s² is not an unbiased estimate of the population variance σ².
The sampling variances are given by

$$\text{Var}(\bar{x}) = \tfrac{1}{9}\sum [\bar{x} - E(\bar{x})]^2 = \tfrac{27}{9} = 3 \qquad\text{and}\qquad \text{Var}(s^2) = \tfrac{1}{9}\sum [s^2 - E(s^2)]^2 = \tfrac{101.25}{9} = 11.25$$

Aliter. The variance of the sample mean in random sampling with replacement is given by Var($\bar{x}$) = σ²/n = 6/2 = 3. This is a very convenient way of finding the sampling variance of the mean.

Example 16·6. (a) A random sample of size 900 is taken from a population with mean 0.1 and standard deviation 2.1. Find the probability that the sample mean is negative.
(b) The amount of time a college student takes to complete [...] has a standard deviation of 40 minutes. A random sample of 64 students is taken. What is the probability that the sample mean will not be less than the population mean by more than 8 minutes? [Delhi Univ. B.A. (Econ. Hons.), 2009]
Solution. (a) In the usual notations, we are given μ = 0.1, σ = 2.1 and n = 900. It may be noted that the population is not specified to be normal. Since the sample size is large, by the Central Limit Theorem the sample mean $\bar{x}$ is asymptotically normal with mean μ and variance σ²/n, so that the standardised variable

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - 0.1}{2.1/30} = \frac{\bar{x} - 0.1}{0.07} \sim N(0, 1) \;\Rightarrow\; \bar{x} = 0.1 + 0.07Z$$

We want

$$P(\bar{x} < 0) = P(0.1 + 0.07Z < 0) = P(Z < -1.43) = P(Z > 1.43) \qquad [\text{By symmetry}]$$
$$= 0.5 - P(0 < Z < 1.43) = 0.5 - 0.4236 = 0.0764 \qquad [\text{From Normal Probability Table}]$$

(b) Here n = 64 and σ = 40, so that S.E.($\bar{x}$) = σ/√n = 40/8 = 5 and Z = ($\bar{x}$ − μ)/5 ∼ N(0, 1). We want

$$P(\bar{x} > \mu - 8) = P\left(Z > -\tfrac{8}{5}\right) = P(Z > -1.6) = P(-1.6 < Z < 0) + 0.5$$
$$= P(0 < Z < 1.6) + 0.5 \ [\text{By symmetry}] = 0.4452 + 0.5 = 0.9452 \qquad [\text{From Normal Probability Table}]$$

Example 16·7. Consider a population of 6 units with values 1, 2, 3, 4, 5, 6. Write down all possible samples of size 2 (without replacement) from this population and verify that the sample mean is an unbiased estimate of the population mean. Also calculate its sampling variance and verify that (a) it agrees with the formula for the variance of the sample mean, and (b) this variance is less than the variance obtained from sampling with replacement.
Solution. Here N = 6. The population mean and variance are given by

$$\mu = \tfrac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = \tfrac{21}{6} = 3.5 \tag{i}$$
$$\sigma^2 = \tfrac{1}{6}\sum x^2 - \mu^2 = \tfrac{91}{6} - 12.25 = 2.917 \tag{ii}$$

The total number of samples of size n = 2 (without replacement) is $^{6}C_{2} = \dfrac{6!}{2!\,(6-2)!} = 15$, as given in the following table.

SAMPLING DISTRIBUTION OF MEAN AND VARIANCE

Sample No.   Sample values   Sample mean ($\bar{x}$)   $\bar{x} - E(\bar{x})$   $[\bar{x} - E(\bar{x})]^2$
    1           (1, 2)            1.5            −2.0              4.00
    2           (1, 3)            2.0            −1.5              2.25
    3           (1, 4)            2.5            −1.0              1.00
    4           (1, 5)            3.0            −0.5              0.25
    5           (1, 6)            3.5             0                0.00
    6           (2, 3)            2.5            −1.0              1.00
    7           (2, 4)            3.0            −0.5              0.25
    8           (2, 5)            3.5             0                0.00
    9           (2, 6)            4.0             0.5              0.25
   10           (3, 4)            3.5             0                0.00
   11           (3, 5)            4.0             0.5              0.25
   12           (3, 6)            4.5             1.0              1.00
   13           (4, 5)            4.5             1.0              1.00
   14           (4, 6)            5.0             1.5              2.25
   15           (5, 6)            5.5             2.0              4.00
  Total                       Σ$\bar{x}$ = 52.5                         17.50

$$E(\bar{x}) = \tfrac{1}{15}\sum \bar{x} = \tfrac{52.5}{15} = 3.5 = \mu \tag{iii}$$

which implies that the sample mean is an unbiased estimate of the population mean.

$$\text{Var}(\bar{x}) = \tfrac{1}{15}\sum [\bar{x} - E(\bar{x})]^2 = \tfrac{17.50}{15} = 1.167 \tag{iv}$$

Verification. (a) In srswor, the variance of the sample mean is given by the formula [c.f. (16·23)]

$$\text{Var}(\bar{x}) = \frac{N-n}{N-1}\cdot\frac{\sigma^2}{n} = \frac{6-2}{6-1}\times\frac{2.917}{2} = 0.8 \times 1.458 = 1.167 \tag{v}$$

which is the same as obtained in (iv).
(b) In srswr, Var($\bar{x}$) = σ²/n = 2.917/2 = 1.458 [From (ii)]. Hence Var($\bar{x}$) in srswr = 1.458 > 1.167 = Var($\bar{x}$) in srswor [Part (a)].

Example 16·8. A simple random sample of size 8 is drawn without replacement from a class of 18 students. The marks obtained by the sampled students in a Mathematics test are 46, 08, 38, 24, 35, 29, 36, 40. Obtain an estimate of the total marks of the whole class and the standard error of your estimate. [I.C.W.A. (Intermediate), Dec. 1998]
Solution. Here N = 18 and n = 8.

CALCULATIONS FOR SAMPLE MEAN AND VARIANCE

x                        46     8    38    24    35    29    36    40    Σx = 256
x − $\bar{x}$                   14   −24     6    −8     3    −3     4     8        0
(x − $\bar{x}$)²               196   576    36    64     9     9    16    64      970

$$\bar{x} = \frac{\sum x}{n} = \frac{256}{8} = 32, \qquad s^2 = \frac{1}{n-1}\sum (x - \bar{x})^2 = \frac{970}{7}$$

Since the sample mean is an unbiased estimate of the population mean, an estimate of the total marks of the whole class is given by N$\bar{x}$ = 18 × 32 = 576.
In srswor, the estimated variance of this estimate is

$$\text{Var}(N\bar{x}) = N^2\,\text{Var}(\bar{x}) = N^2\cdot\frac{N-n}{Nn}\,s^2 = \frac{18 \times 10}{8}\times\frac{970}{7} = \frac{174600}{56} = 3117.8571$$

$$\Rightarrow\quad \text{S.E.}(N\bar{x}) = \sqrt{3117.8571} = 55.84$$