Statistics For Econometrics

The document discusses the importance of sampling and sampling distributions in statistical inference, highlighting how sampling allows for effective generalizations about a population without needing a complete census. It introduces the Central Limit Theorem, which states that the distribution of sample means approaches a normal distribution regardless of the population's distribution as sample size increases. Additionally, it outlines key properties of sampling distributions, including their mean, standard deviation, and the significance of sample size in estimating population parameters.

Uploaded by

Topu Rayhan

Sampling and Sampling Distributions

... large-scale census and surveys. The increasing awareness of the existence of such errors is due to the widespread use of the sampling method, one of the main advantages of which is that it provides an opportunity for greater control of non-sampling errors as well.

SAMPLING DISTRIBUTIONS
Much of the information used in business and industry is gathered by means of sampling. It has been pointed out earlier that not only is it often impossible, either physically or because of limitations imposed by time or pecuniary considerations, to take a census of all the items in the population, but it is also usually unnecessary. The results of a properly taken sample, if subjected to rigorous analysis, will ordinarily enable the investigator to arrive at generalisations that are valid for the entire population. The process of generalising these sample results to the population is referred to as statistical inference. In this chapter, along with the knowledge of certain probability distributions, we shall use certain sample statistics (such as the sample mean, the sample proportion, etc.) in order to estimate and draw inferences about the true population parameters.

For example, in order to be able to use the sample mean to estimate the population mean, we should examine every possible sample (and its mean) that could have occurred in the process of selecting one sample of a certain size. If this selection of all possible samples actually were to be done, the distribution of the results would be referred to as a sampling distribution. Although, in practice, only one such sample is actually selected, the concept of sampling distributions must be examined so that probability theory and its distribution can be used in making inferences about the population parameter values.

Sampling theory has made it possible to deal effectively with these problems. However, before we discuss them in detail from the standpoint of sampling theory, it is necessary to understand the Central Limit Theorem and the following three probability distributions, their characteristics and relations:
(1) The population (universe) distribution,
(2) The sample distribution, and
(3) The sampling distribution.
Central Limit Theorem. The Central Limit Theorem, first introduced by De Moivre during the early eighteenth century, happens to be the most important theorem in statistics. According to this theorem, if we select a large number of simple random samples of size $n$ from any population distribution and determine the mean of each sample, the distribution of these sample means will tend to be described by the normal probability distribution with a mean $\mu$ and variance $\sigma^2/n$. This is true even if the population distribution itself is not normal. Or, in other words, we can say that the sampling distribution of sample means approaches a normal distribution, irrespective of the distribution of the population from which the sample is taken, and the approximation to the normal distribution becomes increasingly close with increase in sample size. Symbolically, the theorem can be explained as follows:

When given $n$ independent random variables $X_1, X_2, X_3, \ldots, X_n$, which have the same distribution (no matter what the distribution), then

$$X = X_1 + X_2 + X_3 + \cdots + X_n$$

is approximately a normal variate when $n$ is large. The mean $\mu$ and variance $\sigma^2$ of $X$ are

$$\mu = \mu_1 + \mu_2 + \mu_3 + \cdots + \mu_n = n\mu_1$$

$$\sigma^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \cdots + \sigma_n^2 = n\sigma_1^2$$

where $\mu_1$ and $\sigma_1^2$ are the mean and variance of $X_1$.

The utility of this theorem is that it requires virtually no conditions on distribution patterns of the individual random variables being summed. As a result, it furnishes a practical method of computing approximate probability values associated with sums of arbitrarily distributed independent random variables.
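
The theorem can be checked numerically. The sketch below is only an illustration under stated assumptions: the population is taken to be exponential with mean 1 and variance 1 (clearly non-normal), and the function name `simulate_sample_means`, the sample size 40 and the 5,000 repetitions are hypothetical choices, not taken from the text.

```python
import random
import statistics


def simulate_sample_means(n, num_samples, seed=0):
    """Return the means of `num_samples` random samples of size `n`.

    Assumption: the population is exponential with mean 1 and variance 1.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


means = simulate_sample_means(n=40, num_samples=5000)

# The mean of the sample means should be close to mu = 1, and their
# variance close to sigma^2 / n = 1 / 40 = 0.025.
print("mean of sample means:", round(statistics.fmean(means), 3))
print("variance of sample means:", round(statistics.pvariance(means), 4))
```

A histogram of `means` would look roughly bell-shaped even though the individual observations are strongly skewed.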

... the measure of variability of the mean from sample to sample and is referred to as the standard deviation of the sampling distribution of sample mean or the standard error of the mean, denoted by $\sigma_{\bar{x}}$ and calculated by*

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

This formula holds only when population is infinite or samples are from finite population with replacement.

It may be noted that in deducing a sampling distribution, we must first make an assumption about the appropriate population parameter. In as much as any value can be assumed for a parameter, depending upon our knowledge or guess of the population, there is no theoretical limit to the number of sampling distributions for the same sample size that can be taken from the population. There is a sampling distribution for each assumed value of a parameter. Also, given the assumed value of a parameter, there is a different sampling distribution of statistics for each specific sample size. Further, under the same assumptions about a population and the same sample size, the distribution of one statistic differs from that of another statistic. For example, the pattern of the distribution of $\bar{x}$ will differ from that of $s^2$, even though both measures are computed from the same sample.

... deviation of the population values.

In fact, the standard deviation of the samples is usually so good an approximation that it can safely be used as an estimate of the corresponding population measure. In order to use $s$ of the sample to estimate $\sigma$ of the population, we make a slight adjustment which has been found to contribute to greater accuracy of the estimate. The adjustment consists of using $(n-1)$ instead of $n$ in the formula for the standard deviation of a sample, i.e., we use

$$s = \sqrt{\frac{\sum (x-\bar{x})^2}{n-1}} \quad \text{instead of} \quad \sqrt{\frac{\sum (x-\bar{x})^2}{n}}.$$

The adjustment decreases the denominator and, therefore, gives a larger result. Thus, the estimated standard deviation of the population is slightly larger than the observed standard deviation of the sample.
Sampling Distribution of the Mean

If a population distribution is normal, the sampling distribution of the mean ($\bar{x}$) is also normal for samples of all sizes as can be seen from the following diagram:

*Let $x_1, x_2, \ldots, x_n$ be independent random variables, each having variance $\sigma^2$. Then

$$\sigma_{\bar{x}}^2 = \mathrm{Var}(\bar{x}) = \mathrm{Var}\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) = \frac{1}{n^2}\left[\mathrm{Var}(x_1) + \mathrm{Var}(x_2) + \cdots + \mathrm{Var}(x_n)\right] = \frac{n\sigma^2}{n^2} = \frac{\sigma^2}{n}, \quad \text{or} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}.$$
[Figure: relationship between the population distribution ($\sigma = 50$) and the sampling distribution of the mean]
The following are the important properties of the sampling distribution of mean:

(1) It has a mean equal to the population mean, i.e., $\mu_{\bar{x}} = \mu$.

(2) It has a standard deviation equal to the population standard deviation divided by the square root of the sample size. That is:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

where $\sigma_{\bar{x}}$ is a measure of the spread of $\bar{x}$ values around $\mu$, or a measure of average sampling error, or simply stated the standard error of the mean,
$\sigma$ = standard deviation of the population,
$n$ = size of the sample.

(3) It is normally distributed. The distribution of sample means for large samples is distributed normally whatever the shape of the population distribution, provided $\sigma$ is finite. Samples of 30 or more items are frequently considered large for statistical purposes. It may be pointed out that, if a population is normal, the distribution of sample means is normal, even if the sample size is small.

It should be noted that $\sigma_{\bar{x}}$ is a measure of the precision with which the sample mean can be used to estimate the true population mean, $\mu$. The standard error $\sigma_{\bar{x}}$ varies directly with the variation in the original population, $\sigma$, and inversely with the square root of the sample size $n$. Thus, as might be expected, the greater the variation among the items in the original population, the greater is the expected sampling error in using $\bar{x}$ as an estimate of $\mu$. Also, the larger the sample size, the smaller the standard error, and the smaller the sample size, the larger the standard error.

In practice, the standard deviation of the population is rarely known, and therefore, the standard deviation of the samples, which closely approximates the standard deviation of the population, is used in place of $\sigma$. Hence, the formula for standard error takes the following form:

$$\sigma_{\bar{x}} = \frac{s}{\sqrt{n}}$$

where $s$ refers to the standard deviation of the sample.
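
As a minimal sketch of the estimated standard error $s/\sqrt{n}$, assuming $s$ is computed with the $(n-1)$ divisor, the function below works on a small made-up data set. The name `standard_error_of_mean` and the sample values are hypothetical and serve only as an illustration.

```python
import math


def standard_error_of_mean(sample):
    """Estimate the standard error of the mean as s / sqrt(n).

    s is the sample standard deviation computed with the (n - 1) divisor.
    """
    n = len(sample)
    mean = sum(sample) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return s / math.sqrt(n)


# Hypothetical observations, used only to exercise the function.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(standard_error_of_mean(sample), 3))

# Repeating the same observations four times leaves the spread almost
# unchanged but quadruples n, so the standard error is roughly halved.
print(round(standard_error_of_mean(sample * 4), 3))
```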


The central limit theorem and the standard error of a sample statistic were based upon the premise that the samples selected were chosen with replacement. However, in survey research and in business, sampling is conducted without replacement from populations that are of a finite size $N$. In these cases, particularly when the sample size $n$ is not small as compared to the population size $N$, a finite population correction factor should be used in developing the particular sampling distribution.
... or $P_r[z > 2.38]$.

From the table, the value $z = 2.38$ corresponds to an area of 0.4913 between the mean and $z = 2.38$. To find the required probability, this area of 0.4913 must be subtracted from the total area to the right of the mean, i.e., $0.5 - 0.4913 = 0.0087$. Therefore, the probability that the sample mean will be greater than 2.1 minutes is only 0.0087.
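
The table lookup for $z = 2.38$ can be reproduced with the standard normal distribution from Python's standard library. This is only a sketch; the helper name `upper_tail_probability` is hypothetical.

```python
from statistics import NormalDist


def upper_tail_probability(z):
    """Return P[Z > z] for a standard normal variate Z."""
    return 1.0 - NormalDist().cdf(z)


p = upper_tail_probability(2.38)
print(round(p, 4))  # about 0.0087, i.e. 0.5 - 0.4913
```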

Distribution of Sample Medians

If a universe is large and can be approximated closely by a normal distribution with a mean $\mu$ and standard deviation $\sigma$, the medians of random samples of size $n$ are distributed with a mean $\mu$ and standard deviation $1.2533\,\sigma/\sqrt{n}$, and the distribution of sample medians is nearly normal if $n$ is large.

Tests of Hypothesis
INTRODUCTION
A hypothesis is an assumption about the population parameter to be tested based on sample information. The statistical testing of hypothesis is the most important technique in statistical inference. Hypothesis tests are widely used in business and industry for making decisions. It is here that probability and sampling theory plays an ever increasing role in constructing the criteria on which business decisions are made. Very often in practice we are called upon to make decisions about population on the basis of sample information. For example, we may wish to decide on the basis of sample data whether a new medicine is really effective in curing a disease, whether one training procedure is better than another, etc. Such decisions are called statistical decisions.

In attempting to reach decisions, it is useful to make assumptions or guesses about the populations involved. Such assumptions, which may or may not be true, are called statistical hypotheses and in general are statements about the probability distributions of the population. The hypothesis is made about the value of some parameter, but the only facts available to estimate the true parameter are those provided by a sample. If the sample statistic differs from the hypothesis made about the population parameter, a decision must be made as to whether or not this difference is significant. If it is, the hypothesis is rejected. If not, it must be accepted. Hence, the term "tests of hypothesis".

Now, if $\theta$ be the parameter of the population and $\hat{\theta}$ is the estimate of $\theta$ in the random sample drawn from the population, then the difference between $\theta$ and $\hat{\theta}$ should be small. In fact, there will be some difference between $\theta$ and $\hat{\theta}$ because $\hat{\theta}$ is based on sample observations and is different for different samples. Such a difference is known as difference due to sampling fluctuations. If the difference between $\theta$ and $\hat{\theta}$ is large, then the probability that it is exclusively due to sampling fluctuations is small. Difference which is caused because of sampling fluctuations is called insignificant difference, and the difference due to some other reasons is known as significant difference. A significant difference arises due to the fact that either the sampling procedure is not purely random or the sample is not from the given population.
Procedure of Hypothesis Testing
The general procedure followed in testing hypothesis comprises the following steps:

(1) Set up a hypothesis. The first step in hypothesis testing is to establish the hypothesis to be tested. Since statistical hypotheses are usually assumptions about the value of some unknown parameter, the hypothesis specifies a numerical value or range of values for the parameter. The conventional approach to hypothesis testing is not to construct a single hypothesis about the population parameter, but rather to set up two different hypotheses. These hypotheses are normally referred to as (i) null hypothesis denoted by $H_0$, and (ii) alternative hypothesis denoted by $H_1$.

The null hypothesis asserts that there is no true difference in the sample statistic and population parameter under consideration (hence the word "null", which means invalid, void or amounting to nothing) and that the difference found is accidental, arising out of fluctuations of sampling.
Business Statistics

Type I and Type II Errors

When a statistical hypothesis is tested, there are four possible results:
(1) The hypothesis is true but our test rejects it.
(2) The hypothesis is false but our test accepts it.
(3) The hypothesis is true and our test accepts it.
(4) The hypothesis is false and our test rejects it.

Obviously, the first two possibilities lead to errors.

If we reject a hypothesis when it should be accepted (possibility No. 1), we say that a Type I error has been made. On the other hand, if we accept a hypothesis when it should be rejected (possibility No. 2), we say that a Type II error has been made. In either case a wrong decision or error in judgment has occurred.
TWO KINDS OF ERRORS IN HYPOTHESIS TESTING

                        Condition
Decision        H0 : True           H0 : False
Accept H0       Correct Decision    Type II Error
Reject H0       Type I Error        Correct Decision
The probability of committing a Type I error is designated as $\alpha$ and is called the level of significance. Therefore,

$$\alpha = P_r[\text{Type I error}] = P_r[\text{Rejecting } H_0 \mid H_0 \text{ is true}]$$

$$(1 - \alpha) = P_r[\text{Accepting } H_0 \mid H_0 \text{ is true}].$$

This probability $(1 - \alpha)$ corresponds to the concept of $100(1 - \alpha)\%$ confidence interval. Our efforts would obviously be to have a small probability of making a Type I error. Hence the objective is to construct the test to minimise $\alpha$.

Similarly, the probability of committing a Type II error is designated by $\beta$. Thus

$$\beta = P_r[\text{Type II error}] = P_r[\text{Accepting } H_0 \mid H_0 \text{ is false}]$$

and

$$(1 - \beta) = P_r[\text{Rejecting } H_0 \mid H_0 \text{ is false}].$$

This probability $(1 - \beta)$ is known as the power of a statistical test.
The following table gives the probabilities associated with each of the four cells shown in the previous table:

                        The hypothesis is
The decision is :   True                          False
Accept H0           (1 - α) Confidence level      β
Reject H0           α                             (1 - β) Power of the test
Sum                 1.00                          1.00
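
To make the four probabilities concrete, the sketch below computes $\beta$ for a right-tailed z-test of a mean. It rests on assumptions that are not part of the text: the population standard deviation is treated as known, and the values 100, 103, 10 and 36 are hypothetical. The function name `type_ii_error` is likewise only illustrative.

```python
import math
from statistics import NormalDist


def type_ii_error(mu0, mu1, sigma, n, alpha):
    """Return beta for a right-tailed z-test of H0: mu = mu0.

    mu1 is the (assumed) true mean under the alternative and sigma the
    (assumed known) population standard deviation.
    """
    se = sigma / math.sqrt(n)
    # H0 is rejected when the sample mean exceeds this critical value.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # beta is the chance of staying below it when the true mean is mu1.
    return NormalDist(mu1, se).cdf(critical_mean)


for alpha in (0.05, 0.01):
    beta = type_ii_error(mu0=100, mu1=103, sigma=10, n=36, alpha=alpha)
    print(f"alpha={alpha}: beta={beta:.3f}, power={1 - beta:.3f}")

# A larger sample lowers beta for the same alpha.
print(round(type_ii_error(100, 103, 10, 144, 0.05), 3))
```

Lowering $\alpha$ from 0.05 to 0.01 raises $\beta$ for a fixed sample size, while increasing $n$ lowers $\beta$ at the same $\alpha$.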


In order for any tests of hypothesis or rules of decisions to be good, they must be designed so as to minimise errors of decision. However, this is not a simple matter, since for a given sample size, an attempt to decrease one type of error is accompanied in general by an increase in the other type of error. The probability of making a Type I error is fixed in advance by the choice of level of significance employed in the test. We can make the Type I error as small as we please by lowering the level of significance. But by doing so, we increase the chance of accepting a false hypothesis, i.e., of making a Type II error. It follows that it is impossible to minimise both errors simultaneously. In the long run, errors of Type I are perhaps more likely to prove serious in research programmes in social sciences than are errors of Type II. In practice, one type of error may be more serious than the other, and so a compromise should be reached in favour of limitation of the more serious error. The only way to reduce both types of error is to increase the sample size, which may or may not be possible.

One-Tailed and Two-Tailed Tests

Basically, there are three kinds of problems of tests of hypothesis. They include:

(i) two-tailed tests, (ii) right-tailed test, and (iii) left-tailed test.

A two-tailed test is that where the hypothesis about the population mean is rejected for value of $\bar{x}$ falling into either tail of the sampling distribution. When the hypothesis about population mean is rejected only for value of $\bar{x}$ falling into one of the tails of the sampling distribution, then it is known as one-tailed test. If it is the right tail, then it is called right-tailed test or one-sided alternative to the right, and if it is on the left tail, then it is one-sided alternative to the left and called left-tailed test. For example, $H_0 : \mu = 100$ tested against $H_1 : \mu > 100$ or $\mu < 100$ is a one-tailed test since $H_1$ specifies that $\mu$ lies on a particular side of 100. The same null hypothesis tested against $H_1 : \mu \neq 100$ is a two-tailed test since $\mu$ can be on either side of 100. The following diagrams would make it more clear:
[Figure: TWO-TAILED TEST, with a rejection region in each tail and the acceptance region between them]

[Figure: RIGHT-TAILED TEST, with the acceptance region on the left and the rejection region in the right tail]

[Figure: LEFT-TAILED TEST, with the rejection region in the left tail and the acceptance region $(1 - \alpha)$ on the right]

The following table gives critical values of $z$ for both one-tailed and two-tailed tests at various levels of significance. Critical values of $z$ for other levels of significance are found by use of the table of normal curve areas:

Level of significance α                     0.10              0.05              0.01              0.005             0.002
Critical value of z for one-tailed tests    -1.28 or 1.28     -1.645 or 1.645   -2.33 or 2.33     -2.58 or 2.58     -2.88 or 2.88
Critical value of z for two-tailed tests    -1.645 and 1.645  -1.96 and 1.96    -2.58 and 2.58    -2.81 and 2.81    -3.08 and 3.08
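
Critical values for other levels of significance can also be computed directly instead of being read from a table of normal curve areas. The sketch below uses the inverse of the standard normal distribution function. The helper name `critical_z` is hypothetical, and the level paired with 2.88 and 3.08 is assumed here to be 0.002. Printed tables round their entries, so small differences in the last digit are to be expected.

```python
from statistics import NormalDist


def critical_z(alpha, two_tailed=False):
    """Return the positive critical value of z for significance level alpha."""
    # A two-tailed test splits alpha equally between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.10, 0.05, 0.01, 0.005, 0.002):
    one = critical_z(alpha)
    two = critical_z(alpha, two_tailed=True)
    print(f"alpha={alpha}: one-tailed {one:.3f}, two-tailed {two:.3f}")
```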

Tests of Hypothesis Concerning Large Samples

Though it is difficult to draw a clear-cut line of demarcation between large and small samples, it is generally agreed that if the size of sample exceeds 30, it should be regarded as a large sample. The tests of significance used for large samples are different from the ones used for small samples* for the reason that the assumptions we make in case of large samples do not hold for small samples. Tests of hypothesis involving large samples are based on the following assumptions:
(1) The sampling distribution of a sample statistic is approximately normal.
(2) Values given by the samples are sufficiently close to the population value and can be used in its place for the standard error of the estimate.
Thus, we have seen that the normal distribution plays a vital role in tests of hypothesis based on large samples (central limit theorem).
Suppose $\hat{\theta}$ is an unbiased estimate of $\theta$, the population parameter. On the basis of $\hat{\theta}$, taken from sample observations, it is to test the hypothesis whether the sample is drawn from a population whose parameter value is $\theta_0$, i.e., we have to test the hypothesis

$$H_0 : \theta = \theta_0$$

If the sampling distribution of $\hat{\theta}$ is normal, then

$$z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} \sim N(0, 1).$$

Let us test the hypothesis at $100\alpha\%$ level of significance. From tables of area under the standard normal curve corresponding to given $\alpha$, we can find an ordinate $z_\alpha$ such that

$$P_r[\,|z| > z_\alpha\,] = \alpha \quad \text{or} \quad P_r[-z_\alpha \le z \le z_\alpha] = 1 - \alpha.$$

If $\alpha = 0.01$, then $z_\alpha = 2.58$, and if $\alpha = 0.05$, then $z_\alpha = 1.96$, and so on.

If the difference between $\hat{\theta}$ and $\theta_0$ is more than $z_\alpha$ times the standard error of $\hat{\theta}$, the difference is regarded significant and $H_0$ is rejected at $100\alpha\%$ level of significance, and if the difference between $\hat{\theta}$ and $\theta_0$ is less than or equal to $z_\alpha$ times the standard error of $\hat{\theta}$, the difference is insignificant and $H_0$ is accepted at $100\alpha\%$ level of significance.
Testing Hypothesis about Population Mean

(a) We shall first take the hypothesis testing concerning the population parameter $\mu$ by considering the two-tailed test

$$H_0 : \mu = \mu_0 \quad \text{and} \quad H_1 : \mu \neq \mu_0$$

where $\mu_0$ is the hypothesised value of $\mu$.

Since the best unbiased estimator of $\mu$ is the sample mean $\bar{x}$, therefore, we shall focus our attention on the sampling distribution of $\bar{x}$. From central limit theorem, we know

$$\bar{x} \sim N(\mu, \sigma_{\bar{x}}^2), \qquad z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$$

where $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ [if $\sigma$ is known]
$\phantom{where \sigma_{\bar{x}}} = \dfrac{s}{\sqrt{n}}$ [if $\sigma$ is unknown, for large samples].

If the calculated value of $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$, the null hypothesis is rejected.

(b) If the hypothesis involves a right-tailed test, for example,

$$H_0 : \mu \le \mu_0 \quad \text{and} \quad H_1 : \mu > \mu_0,$$

then for the calculated value $z > z_\alpha$, the null hypothesis is rejected.

(c) If the hypothesis involves a left-tailed test, i.e.,

$$H_0 : \mu \ge \mu_0 \quad \text{and} \quad H_1 : \mu < \mu_0,$$

then for the value $z < -z_\alpha$, the null hypothesis is rejected.

*See Chapter 16 on Small Sampling Theory.
Illustration 1. The mean lifetime of a sample of 100 light tubes produced by a company is found to be 1,580 hours with standard deviation of 90 hours. Test the hypothesis that the mean lifetime of the tubes produced by the company is 1,600 hours.

Solution. The null hypothesis is that there is no significant difference between the sample mean and hypothetical population mean, i.e., $H_0 : \mu = 1{,}600$ and $H_1 : \mu \neq 1{,}600$.

$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}, \quad \text{where } \sigma_{\bar{x}} = \frac{s}{\sqrt{n}} \quad [\text{since } \sigma \text{ is unknown, for large samples}]$$

$$z = \frac{1{,}580 - 1{,}600}{90/\sqrt{100}} = \frac{-20}{9} = -2.22$$

The critical value is $z = \pm 1.96$ for a two-tailed test at 5% level of significance. Since the computed value of $z = -2.22$ falls in the rejection region, we reject the null hypothesis. Hence, the mean lifetime of the tubes produced by the company may not be 1,600 hours.
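
The arithmetic of Illustration 1 can be checked with a few lines of Python. This is a sketch only: the function name `z_statistic_mean` is hypothetical, and the inputs are the figures as read from the illustration (sample mean 1,580 hours, standard deviation 90 hours, 100 tubes, hypothesised mean 1,600 hours).

```python
import math


def z_statistic_mean(sample_mean, mu0, sd, n):
    """Return z = (sample_mean - mu0) / (sd / sqrt(n)) for a large sample."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))


z = z_statistic_mean(sample_mean=1580, mu0=1600, sd=90, n=100)
reject_h0 = abs(z) > 1.96  # two-tailed test at the 5% level

print(round(z, 2), reject_h0)
```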

Testing Hypothesis about the Difference between Two Means

The test statistic for testing the difference between two population means, when the populations are normally distributed, is based on the general form of the standard normal statistic as given below:

$$z = \frac{\hat{\theta} - \theta}{\sigma_{\hat{\theta}}}$$

where $\theta = \mu_1 - \mu_2$. Since the unbiased estimator of $\mu_1 - \mu_2$ is $\bar{x}_1 - \bar{x}_2$, therefore, $\hat{\theta}$ is replaced by $\bar{x}_1 - \bar{x}_2$. $\sigma_{\hat{\theta}}$, the standard deviation of the sampling distribution of $(\bar{x}_1 - \bar{x}_2)$, is given by

$$\sigma_{\hat{\theta}}^2 = \sigma_{\bar{x}_1 - \bar{x}_2}^2 = \mathrm{Var}[\bar{x}_1 - \bar{x}_2] = \mathrm{Var}[\bar{x}_1] + \mathrm{Var}[\bar{x}_2] = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$

Therefore, the $z$ statistic is given by

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

The null hypothesis is $H_0 : \mu_1 - \mu_2 = 0$.

Then, the $z$ statistic is reduced to

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

At 5% level of significance, the critical value of $z$ for two-tailed test $= \pm 1.96$. If the computed value of $z$ is greater than $+1.96$ or less than $-1.96$, then reject $H_0$; otherwise accept $H_0$.

In case $\sigma_1^2$ and $\sigma_2^2$ are not known, then for large samples, $s_1^2$ and $s_2^2$ can be used instead.
Illustration 2. You are working as a purchase manager for a company. The following information has been supplied to you by two manufacturers of electric bulbs:

                                    Company A    Company B
Mean life (in hours)                1,300        1,288
Standard deviation (in hours)       82           93
Sample size                         100          100

Which brand of bulbs are you going to purchase if you desire to take a risk of 5%? (MBA, Kumaun Univ., 2004)

Solution. Let us take the null hypothesis that there is no significant difference in the quality of the two brands of bulbs, i.e., $H_0 : \mu_1 = \mu_2$.

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \quad [\text{since } \sigma_1^2 \text{ and } \sigma_2^2 \text{ are not known, therefore, can be replaced by } s_1^2 \text{ and } s_2^2]$$

$$z = \frac{1{,}300 - 1{,}288}{\sqrt{\dfrac{(82)^2}{100} + \dfrac{(93)^2}{100}}} = \frac{12}{12.40} = 0.968$$

Since our computed value of $z = 0.968$ is less than the critical value of $z = 1.96$ (5% level), we accept the null hypothesis. Hence, the quality of the two brands of bulbs does not differ significantly.
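
The two-sample statistic of Illustration 2 can be sketched as follows. The scan is hard to read at this point, so the inputs are assumptions: standard deviations of 82 and 93 hours and 100 bulbs in each sample. The function name `z_statistic_two_means` is hypothetical.

```python
import math


def z_statistic_two_means(mean1, mean2, sd1, sd2, n1, n2):
    """Return z for H0: mu1 - mu2 = 0 using large-sample standard deviations."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error


# Assumed readings of the illustration's data (see the note above).
z = z_statistic_two_means(mean1=1300, mean2=1288, sd1=82, sd2=93, n1=100, n2=100)
reject_h0 = abs(z) > 1.96  # two-tailed test at the 5% level

print(round(z, 3), reject_h0)
```

Under these readings the statistic stays well inside the acceptance region, which agrees with the conclusion of the illustration.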
Test of Hypothesis Concerning Attributes

As distinguished from variables where quantitative measurement of a phenomenon is possible, in case of attributes we can only find out the presence or absence of a certain characteristic. For example, in the study of attribute 'employment', a sample may be taken and people classified as employed and unemployed. With such data, the binomial type of problem may be formed. The selection of an individual on sampling may be called 'event', the appearance of an attribute may be taken as "success" and its non-appearance as "failure". The sampling distribution of the number of successes, being a binomial probability model, would have its mean $\mu = np$ and its standard deviation $\sigma = \sqrt{npq}$.

Then

$$z = \frac{x - np}{\sqrt{npq}} \sim N(0, 1).$$

Illustration 3. In 600 throws of a six-faced die, odd points appeared 360 times. Would you say that the die is fair at 5% level of significance?

Solution. Let us take the hypothesis that the die is not biased.

$$p = q = \tfrac{1}{2}, \quad n = 600, \quad np = 300.$$

Applying the formula:

$$z = \frac{x - np}{\sqrt{npq}} = \frac{360 - 300}{\sqrt{600 \times \tfrac{1}{2} \times \tfrac{1}{2}}} = \frac{60}{12.25} = 4.9$$

Since the computed value of $z$ is greater than the table value (1.96 at 5% level of significance), the hypothesis is rejected. Hence, the die does not seem to be fair.
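
The computation in Illustration 3 can be reproduced with a short sketch. The function name `z_statistic_attribute` is hypothetical; the inputs are the illustration's 600 throws, 360 odd results and $p = 1/2$ under the hypothesis of a fair die.

```python
import math


def z_statistic_attribute(successes, n, p):
    """Return z = (x - np) / sqrt(npq) for the number of successes x."""
    q = 1 - p
    return (successes - n * p) / math.sqrt(n * p * q)


z = z_statistic_attribute(successes=360, n=600, p=0.5)
reject_h0 = abs(z) > 1.96  # 5% level of significance

print(round(z, 2), reject_h0)
```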

Testing Hypothesis about a Population Proportion

The population parameter of interest is the population proportion $\pi$. If the sample size is large, then the sample proportion $p$ will be approximately normally distributed. Then

$$z = \frac{p - E(p)}{\sigma_p} \sim N(0, 1).$$

The null hypothesis is that there is no significant difference between the sample proportion and population proportion, i.e., $H_0 : p = \pi$.

Since the sample proportion $p$ is an unbiased estimator of $\pi$, therefore, the statistic

$$z = \frac{p - \pi}{\sigma_p}, \quad \text{where } \sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$$

$$z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} \sim N(0, 1).$$

If $|z| > z_\alpha$, the null hypothesis is rejected at $100\alpha\%$ level of significance.

Illustration 4. A sales clerk in the departmental store claims that 60% of the shoppers entering the store leave without making a purchase. A random sample of 50 shoppers showed that 35 of them left without buying anything. Are these sample results consistent with the claim of the sales clerk? Use a level of significance of 0.05. (MBA, Delhi Univ., Dec. 1998)

Solution. The null hypothesis is

$$H_0 : \pi = 0.60.$$

The sample proportion $p = \dfrac{35}{50} = 0.70$.

Using the $z$ statistic, we have

$$z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \frac{0.70 - 0.60}{\sqrt{\dfrac{0.60 \times 0.40}{50}}} = \frac{0.10}{0.069} = 1.45.$$

The critical value of $z$ is 1.64 at 5% level of significance.

Since the computed value of $z = 1.45$ is less than the critical value of $z = 1.64$, therefore, the null hypothesis cannot be rejected. Hence, based on this sample data, we cannot reject the claim of the sales clerk.
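
A minimal sketch of the proportion test in Illustration 4 follows. The function name `z_statistic_proportion` is hypothetical. Without rounding the standard error to 0.069, the statistic comes out at about 1.44 rather than 1.45, and the conclusion is unchanged.

```python
import math
from statistics import NormalDist


def z_statistic_proportion(successes, n, pi0):
    """Return z = (p - pi0) / sqrt(pi0 * (1 - pi0) / n) for a large sample."""
    p = successes / n
    return (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)


z = z_statistic_proportion(successes=35, n=50, pi0=0.60)
critical = NormalDist().inv_cdf(0.95)  # one-tailed critical value at the 5% level
reject_h0 = z > critical

print(round(z, 2), round(critical, 3), reject_h0)
```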

Testing Hypothesis about the Difference Between Two Proportions

Let $p_1$ and $p_2$ be the sample proportions obtained in large samples of sizes $n_1$ and $n_2$ drawn from respective populations having proportions $\pi_1$ and $\pi_2$. We can test the null hypothesis that there is no difference between the population proportions, i.e.,

$$H_0 : \pi_1 = \pi_2.$$

As shown in the earlier chapter, the sampling distribution of differences in proportion, $p_1 - p_2$, is normally distributed with mean

$$\mu_{p_1 - p_2} = \pi_1 - \pi_2$$

and standard deviation

$$\sigma_{p_1 - p_2} = \sqrt{\frac{\pi_1(1 - \pi_1)}{n_1} + \frac{\pi_2(1 - \pi_2)}{n_2}}$$

Therefore, the statistic is

$$z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\dfrac{\pi_1(1 - \pi_1)}{n_1} + \dfrac{\pi_2(1 - \pi_2)}{n_2}}}$$

If the null hypothesis is true, $p_1$ and $p_2$ are two independent unbiased estimators of the same parameter $\pi_1 = \pi_2 = \pi$. Thus, our procedure is to pool our observations to obtain the best estimate of the common value $\pi$. The pooled estimate of $\pi$ is the weighted mean of the two sample proportions, i.e.,

$$p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$

Our test statistic then becomes

$$z = \frac{p_1 - p_2}{\sigma_{p_1 - p_2}}, \quad \text{where } \sigma_{p_1 - p_2} = \sqrt{p(1 - p)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
,, n2 WccklyEr
Illustration 5. In a random sample of 100 persons taken from village A, 60 are found to be consuming tea. In another sample of 200 persons taken from village B, 100 persons are found to be consuming tea. Do the data reveal significant difference between the two villages so far as the habit of taking tea is concerned? (MBA, Delhi Univ., 1999)

Solution. Let us take the hypothesis that there is no significant difference between the two villages so far as the habit of taking tea is concerned, i.e., $\pi_1 = \pi_2$.

We are given

$$p_1 = \frac{60}{100} = 0.6, \quad n_1 = 100; \qquad p_2 = \frac{100}{200} = 0.5, \quad n_2 = 200.$$

$$z = \frac{p_1 - p_2}{\sqrt{pq\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where

$$p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{60 + 100}{100 + 200} = 0.53, \quad q = 1 - p = 0.47$$

$$z = \frac{0.6 - 0.5}{\sqrt{0.53 \times 0.47 \times \left(\dfrac{1}{100} + \dfrac{1}{200}\right)}} = \frac{0.1}{\sqrt{0.0037}} = \frac{0.1}{0.061} = 1.64$$

Since the computed value of $z$ is less than the critical value of $z = 1.96$ at 5% level of significance, therefore, we accept the hypothesis. Hence, we conclude that there is no significant difference in the habit of taking tea in the two villages A and B.
Illustration 6. Before an increase in excise duty on tea, 400 people out of a sample of 500 people were found to be tea drinkers. After an increase in duty, 400 people were tea drinkers in a sample of 600 people. State whether there is a significant decrease in the consumption of tea. (MBA, Delhi Univ., 2002)
Solution. Let us take the hypothesis that there is no significant decrease in the consumption of tea after the increase in duty, i.e., $\pi_1 = \pi_2$.
We are given

$$p_1 = \frac{400}{500} = 0.8,\; n_1 = 500; \qquad p_2 = \frac{400}{600} = 0.667,\; n_2 = 600.$$

The appropriate test statistic to be used here is given by

$$z = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \quad \text{where } p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{400 + 400}{500 + 600} = 0.73.$$

$$z = \frac{0.8 - 0.667}{\sqrt{(0.73)(0.27)(0.0037)}} = \frac{0.133}{\sqrt{0.00073}} = \frac{0.133}{0.027} = 4.93.$$

Since the computed value of z is greater than the critical value of z = 1.96 at 5% level of significance, therefore, the hypothesis is rejected. Hence, there is a significant decrease in the consumption of tea after the increase in duty.
Illustration 7. From the following data obtained from a sample of 1,000 persons, calculate the standard error of mean.

Weekly Earnings (Rs. hundred):  0-10  10-20  20-30  30-40  40-50  50-60  60-70  70-80
No. of persons:                   50    100    150    200    200    100    100    100

Is it likely that the sample has come from the population with an average weekly earnings of Rs. 4,200?
Solution.
CALCULATION OF STANDARD DEVIATION

Weekly Earnings     x      f     d = (x - 45)/10     fd     fd²
(Rs. hundred)
0-10                5     50          -4            -200    800
10-20              15    100          -3            -300    900
20-30              25    150          -2            -300    600
30-40              35    200          -1            -200    200
40-50              45    200           0               0      0
50-60              55    100          +1            +100    100
60-70              65    100          +2            +200    400
70-80              75    100          +3            +300    900

                       n = 1000                 Σfd = -400   Σfd² = 3900

$$\bar{x} = A + \frac{\sum fd}{n} \times i = 45 - \frac{400}{1000} \times 10 = 41$$

$$s = \sqrt{\frac{\sum fd^2}{n} - \left(\frac{\sum fd}{n}\right)^2} \times i = \sqrt{\frac{3900}{1000} - \left(\frac{-400}{1000}\right)^2} \times 10 = 1.934 \times 10 = 19.34$$

$$\text{S.E.}_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{19.34}{\sqrt{1000}} = \frac{19.34}{31.62} = 0.612$$

With the hypothesised population mean of Rs. 4,200 (i.e., 42 in Rs. hundred),

$$z = \frac{\bar{x} - \mu}{\text{S.E.}_{\bar{x}}} = \frac{41 - 42}{0.612} = -1.63.$$

Since the computed value of z lies within ±1.96, it is not significant, and hence there is no reason to doubt that the sample has come from such a population; the difference could have arisen due to fluctuations of sampling.

Illustration 8. A sample of 400 managers is found to have a mean height of 171.38 cms. Can it be reasonably regarded as a sample from a large population of mean height 171.17 cms and standard deviation of 3.30 cms?
Solution. The null hypothesis is that there is no significant difference between the sample mean height and the population mean height.
Given $\bar{x}$ = 171.38, μ = 171.17, n = 400, and σ = 3.30.
Applying the test statistic

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{171.38 - 171.17}{3.30/\sqrt{400}} = \frac{0.21}{0.165} = 1.27.$$

Since the computed value of z = 1.27 is less than the critical value of z = 1.96 at 5% level of significance, therefore, the null hypothesis is accepted. Hence there is no significant difference between the sample mean height and population mean height.
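As a cross-check on the step-deviation arithmetic of the weekly-earnings illustration, the mean, standard deviation and standard error can be computed directly from the class mid-points and frequencies. This is a minimal sketch; the helper name `grouped_mean_sd` is an assumption of this example, and the divisor n (not n - 1) is used because that is what the worked solution uses for this large sample.

```python
import math

# Class mid-points and frequencies from the weekly-earnings table (Rs. hundred).
mids = [5, 15, 25, 35, 45, 55, 65, 75]
freqs = [50, 100, 150, 200, 200, 100, 100, 100]

def grouped_mean_sd(mids, freqs):
    """Mean and standard deviation of grouped data (divisor n)."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(mids, freqs)) / n
    var = sum(f * (x - mean) ** 2 for x, f in zip(mids, freqs)) / n
    return n, mean, math.sqrt(var)

n, mean, sd = grouped_mean_sd(mids, freqs)
se = sd / math.sqrt(n)   # standard error of the mean
z = (mean - 42) / se     # hypothesised mean: Rs. 4,200 = 42 (Rs. hundred)
print(mean, sd, se, z)
```

Working directly with the mid-points avoids choosing an assumed mean A and class interval i, at the cost of slightly heavier arithmetic; both routes give the same mean and standard deviation.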
Illustration 9. Intelligence test given to two groups of boys and girls gave the following information:

           Mean    S.D.    Number
Girls       75      10       50
Boys        70      12      100

Is the difference in the mean scores of boys and girls statistically significant?
Solution. Let us take the hypothesis that the difference in the mean score of boys and girls is not significant, i.e., $\mu_1 = \mu_2$.
We are given $\bar{x}_1 = 75$, $\bar{x}_2 = 70$, $s_1^2 = 100$, $s_2^2 = 144$, $n_1 = 50$, $n_2 = 100$.
The appropriate statistic to be used is given by

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \quad [\text{since } \sigma_1^2 = s_1^2,\ \sigma_2^2 = s_2^2 \text{ and } \mu_1 = \mu_2]$$

$$z = \frac{75 - 70}{\sqrt{\dfrac{100}{50} + \dfrac{144}{100}}} = \frac{5}{\sqrt{3.44}} = \frac{5}{1.855} = 2.695.$$

Since the computed value z = 2.695 is greater than the critical value of z = 2.58 at 1% level of significance, therefore the hypothesis is rejected. Hence, the difference in the mean score of boys and girls is statistically significant.
Illustration 10. In a survey of buying habits, 400 women shoppers are chosen at random in super market A. Their average weekly food expenditure is Rs. 250 with a standard deviation of Rs. 40. For another group of 400 women shoppers chosen at random in super market B located in another area of the same city, the average weekly food expenditure is Rs. 220 with a standard deviation of Rs. 55. Test at 1% level of significance whether the average weekly food expenditures of the two populations of women shoppers are equal.
Solution. The null hypothesis is that the average weekly food expenditures of the two populations are the same, i.e., $\mu_1 = \mu_2$. Since $\sigma_1^2$ and $\sigma_2^2$ (the population variances) are not known, we can estimate them from the sample variances, provided the samples are large.
Given: $n_1 = 400$, $n_2 = 400$, $\bar{x}_1 = 250$, $\bar{x}_2 = 220$, $s_1 = 40$, $s_2 = 55$.
Applying the test statistic

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \quad [\text{since } \sigma_1^2 = s_1^2,\ \sigma_2^2 = s_2^2]$$

$$z = \frac{250 - 220}{\sqrt{\dfrac{(40)^2}{400} + \dfrac{(55)^2}{400}}} = \frac{30}{\sqrt{11.56}} = \frac{30}{3.4} = 8.82.$$

Since the computed value of z is greater than the critical value of z = 2.58 at 1% level of significance, the null hypothesis is rejected. Hence, the average weekly expenditure of the two populations of women shoppers differ significantly.
Illustration 11. A dice is thrown 49,152 times and of these 25,145 yielded either 4 or 5 or 6. Is this consistent with the hypothesis that the dice is unbiased?
Solution. If the occurrence of 4, 5 or 6 is termed as success, then the null hypothesis can be stated as that the dice is unbiased, i.e., π = 0.5.
The appropriate statistic to be used is

$$z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}}, \quad \text{where } p = \frac{25145}{49152} = 0.5116.$$

$$z = \frac{0.5116 - 0.5}{\sqrt{\dfrac{0.5 \times 0.5}{49152}}} = \frac{0.0116}{0.002255} = 5.13.$$

Since the computed value of z is much greater than the critical value of z, it is highly significant and therefore, the null hypothesis is rejected. Hence, the dice cannot be regarded as unbiased.
*ur. tiTi.iii J i,i .rti,ru ttt
*1'!* titr t:
*,*#ffi ffi
"gott"v
Slg,ffi ffi $*
u niv',tffu
Du'$sn fl,l*ifmxH #,* ""
Solution. Let us take the null hypothesis that there is no significant difference between the figure observed and the figure claimed, i.e., 9.3 and 8.9.
We are given μ = 8.9, $\bar{x}$ = 9.3, s = 1.8, n = 50.
The appropriate statistic to be used is given by

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \quad [\text{since } \sigma = s \text{ for large samples}]$$

$$z = \frac{9.3 - 8.9}{1.8/\sqrt{50}} = \frac{0.4}{0.2546} = 1.57.$$

Since the computed value of z = 1.57 is less than the critical value of z = 1.96 at 5% level of significance, therefore, the hypothesis is accepted. Hence, there is no significant difference between the average figure observed and the figure claimed.
trtffiillt"'55tis,osscd,0',imes-n,#ffiflf$fl:ffimli}llffi
#lffi '
tcvel of siglificancc'
thcrpforc' tha ,"',Hlii$;:m"f,H,m5f iil'JlIi"

ffiffi
7o

The appropriate statistic to be used here is z-statistic. We are given: p = 0.3, π = 0.5 and n = 100.

$$z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}} = \frac{0.3 - 0.5}{\sqrt{\dfrac{0.5 \times 0.5}{100}}} = \frac{-0.2}{0.05} = -4.0.$$

Since the computed value of |z| = 4 is greater than the critical values of z = ±2.58 at 1% level of significance, therefore the null hypothesis is rejected.
'$iltrlfi.*ill*Hiliffirg""r''ouioacrrcs*atthcprorinor
Mcthod 2 ttrc csnspodiry

gtnt t#i;;a " in rhc prod-r* Pfcftr$ta


is
M€rtod I has srmplc,o*,il*;,Jr*r** ,.o ,o
valucs of mcan r,-*ofii'**-#loolcvcl
,,,r
q s%
ru,
or 'iilnoi*a' #; il;ffi-r it uab & 5
usc rn appropriatc largc
"'ilr"Lj 'iJi'iliffi';';{'q!ht
= 512;
or2 = sz2l
;r.d,r",ild thc null hypothesis'
"lcarly
I
s12@
Solution. Let the null hypothesis be that there is no significant difference between Method 1 and Method 2, i.e., $H_0: \mu_1 = \mu_2$.
We are given $\bar{x}_1 = 106$, $\bar{x}_2 = 100$, $s_1 = 12$, $s_2 = 10$, $n_1 = n_2 = 64$.
Using the test statistic

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{106 - 100}{\sqrt{\dfrac{(12)^2}{64} + \dfrac{(10)^2}{64}}} = \frac{6}{\sqrt{244/64}} = \frac{6}{1.953} = 3.07.$$

Since the computed value of z = 3.07 is greater than the critical value of z = 1.64 at 5% level of significance, the null hypothesis is rejected. Hence, Method 1 is better than Method 2.
Illustration 15. A company is considering two different television advertisements for promotion of a new product. Management believes that advertisement A is more effective than advertisement B. Two test market areas with virtually identical consumer characteristics are selected: advertisement A is used in one area and advertisement B in the other area. In a random sample of 60 customers who saw advertisement A, 18 tried the product. In a random sample of 100 customers who saw advertisement B, 22 tried the product. Does this indicate that advertisement A is more effective than advertisement B, if a 5% level of significance is used? (MFC, Delhi Univ., 1996; MBA, Delhi Univ., 2004; MBA, IGNOU, 2002)
Solution. Let the null hypothesis be that there is no significant difference in the effectiveness of the two advertisements A and B, i.e., $H_0: \pi_1 = \pi_2$.
The appropriate statistic to be used is

$$z = \frac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sigma_{p_1 - p_2}} = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \quad [\text{since } \pi_1 = \pi_2]$$

where $p_1 = \dfrac{x_1}{n_1} = \dfrac{18}{60} = 0.30$, $n_1 = 60$; $p_2 = \dfrac{x_2}{n_2} = \dfrac{22}{100} = 0.22$, $n_2 = 100$;

and $p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{40}{160} = 0.25$.

$$z = \frac{0.30 - 0.22}{\sqrt{(0.25)(0.75)\left(\dfrac{1}{60} + \dfrac{1}{100}\right)}} = \frac{0.08}{\sqrt{0.005}} = \frac{0.08}{0.0707} = 1.13.$$

Since the computed value of z = 1.13 is less than the critical value of z = 1.645* at 5% level of significance, therefore the null hypothesis is accepted. Hence, there is no significant difference in the effectiveness of the two advertisements A and B.

Illustration 16. 500 units from a factory are inspected and 12 are found to be defective; 800 units from another factory are inspected and 12 are found to be defective. Can it be concluded at 5% level of significance that production at second factory is better than in first factory? (MBA, Kurukshetra Univ., 1996; MBA, DU, 2002)
Solution. Let us take the null hypothesis that there is no significant difference in the proportion of defective items in the two factories.

$$p_1 = \frac{x_1}{n_1} = \frac{12}{500} = 0.024; \qquad p_2 = \frac{x_2}{n_2} = \frac{12}{800} = 0.015; \qquad p = \frac{12 + 12}{500 + 800} = 0.018.$$

$$z = \frac{p_1 - p_2}{\sqrt{p(1-p)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{0.024 - 0.015}{\sqrt{(0.018)(0.982)(0.00325)}} = \frac{0.009}{0.0076} = 1.184.$$

Since the computed value of z is less than the critical value of z = 1.96 at 5% level of significance, therefore, our null hypothesis holds good. Hence, we cannot conclude that the production in the second factory is better than in the first factory.

* Normally, in testing hypothesis, we use two-tailed test and the critical value of z at 5% level is 1.96. In this question, one-tailed test has been used and the critical value at 5% is 1.645.
Illustration 17. A buyer of electric bulbs bought 100 bulbs each of two famous brands. Upon testing these he found that brand A had a mean life of 1500 hours with a standard deviation of 50 hours whereas brand B had a mean life of 1530 hours with a standard deviation of 60 hours. Can it be concluded at 5 per cent level of significance that the two brands differ significantly in quality of the bulbs?
Solution. Let us take the null hypothesis that the two brands of bulbs do not differ significantly in quality.
We are given $\bar{x}_1 = 1500$, $\bar{x}_2 = 1530$, $s_1 = 50$, $s_2 = 60$, $n_1 = 100$, $n_2 = 100$.
The appropriate statistic to be used here is given by

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{1500 - 1530}{\sqrt{\dfrac{(50)^2}{100} + \dfrac{(60)^2}{100}}} = \frac{-30}{\sqrt{61}} = \frac{-30}{7.81} = -3.84.$$

Since the computed value of |z| is more than the table value of z = 1.96 at 5% level of significance, the null hypothesis is rejected. Hence, the two brands of bulbs differ significantly in quality.
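The large-sample test for two means used in the preceding illustrations reduces to one formula once the sample variances are substituted for the population variances. The sketch below applies it to the electric-bulb figures; the function name `two_mean_z` is an assumption of this example.

```python
import math

def two_mean_z(mean1, s1, n1, mean2, s2, n2):
    """Large-sample z statistic for H0: mu1 = mu2, using s1, s2 in place of sigma1, sigma2."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (mean1 - mean2) / se

# Brand A: mean 1500 h, s.d. 50 h; brand B: mean 1530 h, s.d. 60 h; 100 bulbs each.
z = two_mean_z(1500, 50, 100, 1530, 60, 100)
print(round(z, 2))
```

For a two-tailed test only the magnitude of z matters; the sign merely records which sample mean is larger.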
Illustration 18. Two types of new cars produced in India are tested for petrol mileage. One group consisting of 36 cars averaged 14 kms per litre, while the other group consisting of 72 cars averaged 12.5 kms per litre.
(a) What test statistic is appropriate, if $\sigma_1^2 = 1.5$ and $\sigma_2^2 = 2.0$?
(b) Test whether there exists a significant difference in the petrol consumption of these two types of cars (use α = 0.01). (MBA, IIT Roorkee, 2000)
Solution. We are given the following information:

$$n_1 = 36,\ \bar{x}_1 = 14,\ \sigma_1^2 = 1.5; \qquad n_2 = 72,\ \bar{x}_2 = 12.5,\ \sigma_2^2 = 2.0.$$

(a) The appropriate test statistic is the z statistic.
(b) Let us take the null hypothesis that there is no significant difference in the petrol consumption of the two types of cars, i.e., $H_0: \mu_1 = \mu_2$.

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{14 - 12.5}{\sqrt{\dfrac{1.5}{36} + \dfrac{2.0}{72}}} = \frac{1.5}{0.264} = 5.68.$$

Since the calculated value of z = 5.68 is greater than the critical value of z = 2.58 (1% level), the null hypothesis is rejected. Hence, there is a significant difference in the petrol consumption of the two types of cars.
Illustration 19. The Educational Testing Service conducted a study to investigate difference between the scores of male and female students on the Scholastic Aptitude Test. The study identified a random sample of 562 female and 852 male students who had achieved the same high score on the mathematics portion of the test. That is, the female and male students were viewed as having similarly high abilities in mathematics. The verbal scores for the two samples are given below:

Female students: $\bar{x}_1 = 547$; $s_1 = 83$;  Male students: $\bar{x}_2 = 525$; $s_2 = 78$

Do the data support the conclusion that given a population of female students and a population of male students with similarly high mathematics abilities, the female students will have a significantly higher verbal ability? Test at a 5% level of significance. What is your conclusion? (MBA, DU, 2003)
Solution. Given: $\bar{x}_1 = 547$, $\bar{x}_2 = 525$, $s_1 = 83$, $s_2 = 78$, $n_1 = 562$, $n_2 = 852$.
Let us take the null hypothesis that female verbal ability is not higher than male verbal ability, i.e.,

$$H_0: \mu_1 - \mu_2 \le 0 \quad \text{and} \quad H_1: \mu_1 - \mu_2 > 0.$$

Using

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{547 - 525}{\sqrt{\dfrac{(83)^2}{562} + \dfrac{(78)^2}{852}}}$$
Small Sampling Theory
INTRODUCTION
The techniques examined in earlier chapters (13, 14, 15) under the general headings of sampling distributions, estimation of parameters and tests of hypothesis were based on a knowledge of the underlying sampling distribution of the sample statistic for large samples. We have discussed earlier that if the original population is normally distributed, all sampling distributions of the mean shall be normally distributed regardless of the sample size (central limit theorem). If the original population is normally distributed and the standard deviation of the population is unknown (and therefore, has to be estimated from a sample), the sampling distribution of the mean derived from large samples will also be normally distributed, but if the sample size is small (say 30, or less) then the sample statistic will follow a t-distribution. Problems of estimation and tests of hypothesis for large samples were developed in previous chapters and this chapter extends these concepts for small samples, when the underlying sampling distribution of the mean follows a Student's t-distribution.
The Student's t-distribution obtained by W.S. Gosset was published under the pen name of "Student" in the year 1908. It is reported that Gosset was a statistician for a brewery and that the management did not want him to publish his scholarly theoretical work under his real name and bring shame to his employer. Consequently, he selected the pen name of Student.
As a matter of fact, procedures of statistical inference for small samples are the same as those presented in preceding two chapters. The study of statistical inference with the small samples is called "small sampling theory or exact sampling theory". In this chapter, we shall discuss in detail the "t" and "F" distributions. These two distributions are defined in terms of number of degrees of freedom. It is appropriate at this stage to clarify this concept.
Degrees of freedom. The number of degrees of freedom, usually denoted by the Greek symbol ν (read as nu), can be interpreted as the number of useful items of information generated by a sample of given size with respect to the estimation of a given population parameter. Thus, a sample of size 1 generates one piece of useful information if one is estimating the population mean, but none, if one is estimating the population variance. In order to know about the variance, one needs at least a sample of size n ≥ 2. The number of degrees of freedom, in general, is the total number of observations minus the number of independent constraints imposed on the observations.

Suppose the expression $\sum X = X_1 + X_2 + X_3 + X_4$ has four terms. We can arbitrarily assign values to any three of these four values (for example, 15 = X + 2 + 8) but the value of the fourth is automatically determined (for example, X = 5).
In this example, there are 3 degrees of freedom. If n is the number of observations and k is the number of independent constants (the number of constants that have to be estimated from the original data) then n - k is the number of degrees of freedom.

If we consider samples of size n drawn from a normal (or approximately normal) population with mean μ and if for each sample we compute t, using the sample mean $\bar{x}$ and sample standard deviation s, the sampling distribution for t can be obtained. The probability density function of the t-distribution is given by

$$f(t) = \frac{Y_0}{\left(1 + \dfrac{t^2}{\nu}\right)^{(\nu+1)/2}}, \qquad -\infty < t < \infty$$

where $Y_0$ is a constant depending on n such that the total area under the curve is one.
ν = n - 1 is called the number of degrees of freedom.
Properties of t-Distribution
(1) The t-distribution ranges from -∞ to ∞ just as does a normal distribution.
(2) The t-distribution, like the standard normal distribution, is bell-shaped and symmetrical around mean zero.
(3) The shape of the t-distribution changes as the number of degrees of freedom changes. Therefore, for different degrees of freedom, the t-distribution has a family of distributions. Hence, the degrees of freedom ν is a parameter of the t-distribution.
(4) The variance of the t-distribution is always greater than one and is defined only when ν ≥ 3 and is given as

$$\text{Var}(t) = \frac{\nu}{\nu - 2}.$$

(5) The t-distribution is less peaked at the centre and higher in the tails than the normal distribution.
(6) The t-distribution has a greater dispersion than the standard normal distribution. As n gets larger, the t-distribution approaches the normal form. When n is as large as 30, the difference is very small. Relation between the t-distribution and standard normal distribution is shown in the diagram.
nall samples is called NORMAL
i in detail the "r" and DISTHIBUTION

rees offreedom, lt iis


I DFTRBUTION
nr t5
,the Greek symbol v I DISTRIBUTbN
.l ttr6
rated by a samplc of
:
b a sample of sizp I
n, but none, if onc is
I at least a sample of I
rservations minus the Sbndlrd Normd Disnibution compqcd with ditibution whcn r-j and r - t5
The t-distribution has different shapes depending on the size of the sample. When the sample is quite small, for example, if n equals five, the height of the t-distribution is shorter than the normal distribution and the tails are wider. As n nears 30, however, the t-distribution approaches the normal distribution in shape.
The t-table. The table given at the end of the book is the probability integral of t-distribution. It gives, over a range of values of ν, the values of t at different levels of significance. By selecting a particular degrees of freedom and level of significance, we determine the tabular value of t. We then test the hypothesis: if our computed t is greater than the tabular t, we reject the null hypothesis; if our computed t is smaller than the tabular t, we accept the null hypothesis.
Applications of t-distribution. The following are some important applications of the t-distribution:
(1) Test of hypothesis about the population mean.
(2) Test of hypothesis about the difference between two means.
(3) Test of hypothesis about the difference between two means with dependent samples.
(4) Test of hypothesis about coefficient of correlation.

(1) Test of Hypothesis about the Population Mean (σ unknown and sample size is small).
When the population distribution is normal and standard deviation σ is unknown then the "t" statistic defined as

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

follows the Student's t-distribution with (n - 1) d.f.,
where $\bar{x}$ = sample mean
μ = hypothesised population mean
n = sample size
and s is the standard deviation of the sample calculated by the formula

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}.$$

The null hypothesis to be tested is whether there is a significant difference between $\bar{x}$ and μ. If the calculated value of t exceeds the table value of t at a specified level of significance, the null hypothesis is rejected and the difference between $\bar{x}$ and μ is regarded significant. If the calculated value of t is less than the table value, the difference between $\bar{x}$ and μ is not considered to be significant. It may be noted that this test is based on n - 1 degrees of freedom.
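A minimal sketch of this test from summary figures follows. The function name `one_sample_t` is an assumption of this example, and the inputs (10 tins, mean 15.8 kg, standard deviation 0.50 kg, intended weight 16 kg) are the filling-machine figures used later in the chapter.

```python
import math

def one_sample_t(mean, s, n, mu0):
    """t statistic for H0: mu = mu0, with n - 1 degrees of freedom."""
    return (mean - mu0) / (s / math.sqrt(n))

t = one_sample_t(15.8, 0.50, 10, 16)
table_value = 2.262  # two-tailed 5% value of t for 9 d.f., read from the t-table
print(t, abs(t) > table_value)
```

The table value is passed in by hand because the chapter works from a printed t-table; a statistical library could supply it instead.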
Confidence Interval for the Population Mean. When sampling is from a normally distributed population with unknown σ, the 100(1 - α) per cent confidence interval for the population mean is given by

$$\bar{x} \pm t_{\alpha/2,\,\nu}\, \frac{s}{\sqrt{n}}.$$

This follows from

$$\Pr\left[-t_{\alpha/2,\,\nu} < t < t_{\alpha/2,\,\nu}\right] = 1 - \alpha$$

or

$$\Pr\left[-t_{\alpha/2,\,\nu} < \frac{\bar{x} - \mu}{s/\sqrt{n}} < t_{\alpha/2,\,\nu}\right] = 1 - \alpha.$$

Hence, the 100(1 - α)% confidence interval is given by

$$\bar{x} - t_{\alpha/2,\,\nu}\, \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,\nu}\, \frac{s}{\sqrt{n}}.$$
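The interval can be sketched as below. The function name `t_confidence_interval` is an assumption of this example; the figures (mean 67.5, s = 2.8, n = 10) are those of the share-price illustration later in the chapter, and 2.262 is the two-tailed 5% table value of t for 9 d.f.

```python
import math

def t_confidence_interval(mean, s, n, t_crit):
    """Interval mean -/+ t_crit * s / sqrt(n) for the population mean."""
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width

low, high = t_confidence_interval(67.5, 2.8, 10, 2.262)
print(low, high)
```

A hypothesised mean that falls outside this interval is exactly one that the two-tailed t-test rejects at the same level of significance.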
Illustration 1. Ten tins are taken at random from an automatic filling machine. The mean weight of the tins is 15.8 kg and standard deviation is 0.50 kg. Does the sample mean differ significantly from the intended weight of 16 kg? (MBA, DU, 1999)
Solution. Let the null hypothesis be that the sample mean weight is not different from the intended weight.
Given that n = 10, $\bar{x}$ = 15.8, s = 0.50, μ = 16.
Using the t-test, we have

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{15.8 - 16}{0.50/\sqrt{10}} = \frac{-0.2}{0.158} = -1.266.$$

The table value of t for 9 d.f. at 5% level of significance is 2.26. The computed value of t is smaller than the table value of t. Therefore, the difference is insignificant and the null hypothesis is accepted. Hence the difference between sample mean weight and the intended weight is insignificant.

Illustration 2. Prices of shares (in Rs.) of a company on the different days in a month were found to be:
66, 65, 69, 70, 69, 71, 70, 63, 64 and 68
Test whether the mean price of the shares in the month is 65. (MBA, Delhi Univ., 2001)
Solution. Null hypothesis $H_0: \mu = 65$. Assuming the population to be normally distributed and the population standard deviation is unknown, the appropriate test statistic to be used is

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}.$$

$\bar{x}$ and s can be computed from the sample values from the following table:

  x      d = (x - 65)     d²
 66           1            1
 65           0            0
 69           4           16
 70           5           25
 69           4           16
 71           6           36
 70           5           25
 63          -2            4
 64          -1            1
 68           3            9

         Σd = 25      Σd² = 133

$$\bar{x} = A + \frac{\sum d}{n} = 65 + \frac{25}{10} = 67.5$$

$$s = \sqrt{\frac{\sum d^2}{n-1} - \frac{(\sum d)^2}{n(n-1)}} = \sqrt{\frac{133}{9} - \frac{(25)^2}{10 \times 9}} = \sqrt{14.78 - 6.94} = 2.8$$

Therefore,

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{67.5 - 65}{2.8/\sqrt{10}} = \frac{2.5}{0.885} = 2.82.$$

The table value of t for 9 degrees of freedom at 5% level of significance is 2.26. Since the computed value of t = 2.82 is greater than the table value, we reject the null hypothesis. Hence, the mean price of the shares in the month is not 65.
""iri.
(2) Test of Hypothesis about the Difference between Two Means.
In testing a hypothesis concerning the difference between the means of two normally distributed populations when the population variances are unknown, the t-test can be used in two types of cases: (a) the case in which variances are equal, i.e., $\sigma_1^2 = \sigma_2^2$, (b) the case in which variances are not equal, i.e., $\sigma_1^2 \ne \sigma_2^2$.
(a) Case of equal variances. Let the null hypothesis be that there is no significant difference between the means of the two populations, i.e., $H_0: \mu_1 = \mu_2$. When the population variances (though unknown) are equal then the appropriate test statistic to be used is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

which will follow t-distribution with $(n_1 + n_2 - 2)$ d.f., where $\bar{x}_1$ and $\bar{x}_2$ are sample means of sample 1 of size $n_1$ and sample 2 of size $n_2$ respectively; $\mu_1$ and $\mu_2$ are the population means, and s is the "pooled" estimate of the common population standard deviation obtained by pooling the data from both the samples as given below:

$$s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

where $s_1^2 = \dfrac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1}$ and $s_2^2 = \dfrac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1}$.

Therefore, alternatively s can be computed from:

$$s = \sqrt{\frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}}.$$

If the computed value of t is less than the table value of t at a specified level of significance, the null hypothesis is accepted and the difference between the two means is regarded as insignificant. If the computed value of t is more than the table value of t, the null hypothesis is rejected and the difference between the sample means is regarded as significant.
(b) Case of unequal variances. When the population variances are not equal, i.e., $\sigma_1^2 \ne \sigma_2^2$, we use the unbiased estimators $s_1^2$ and $s_2^2$ to replace $\sigma_1^2$ and $\sigma_2^2$. In this case, the difficulty arises because the sampling distribution has larger variability than the population variability. The statistic

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

may not strictly follow t-distribution but may be approximated by t-distribution with a modified value for the degrees of freedom given by

$$\text{d.f.} = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}.$$
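The modified degrees of freedom are tedious to work out by hand, so a short sketch may help. The function name `welch_df` and the sample figures ($s_1 = 20$, $n_1 = 10$, $s_2 = 25$, $n_2 = 18$) are illustrative assumptions of this example, not values taken from the text.

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the unequal-variance t statistic."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

df = welch_df(20, 10, 25, 18)
print(df)
```

The result is generally not a whole number; when a printed t-table is used it is commonly rounded down to the nearest integer, which errs on the side of a larger table value.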
Illustration 3. Two different types of drugs A and B were tried on certain patients for increasing weight. 5 persons were given drug A and 7 persons were given drug B. The increase in weight (in pounds) is given below:
Drug A:   8  12  13   9   3
Drug B:  10   8  12  15   6   8  11
Do the two drugs differ significantly with regard to their effect in increasing weight?
Solution. Null hypothesis $H_0: \mu_1 = \mu_2$, i.e., there is no significant difference in the efficacy of the two drugs. Applying t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

CALCULATION FOR $\bar{x}_1$, $\bar{x}_2$ AND s

 x₁    (x₁ - x̄₁)   (x₁ - x̄₁)²      x₂    (x₂ - x̄₂)   (x₂ - x̄₂)²
  8       -1           1            10        0           0
 12       +3           9             8       -2           4
 13       +4          16            12       +2           4
  9        0           0            15       +5          25
  3       -6          36             6       -4          16
                                     8       -2           4
                                    11       +1           1

Σx₁ = 45  Σ(x₁ - x̄₁) = 0  Σ(x₁ - x̄₁)² = 62    Σx₂ = 70  Σ(x₂ - x̄₂) = 0  Σ(x₂ - x̄₂)² = 54

$$\bar{x}_1 = \frac{\sum x_1}{n_1} = \frac{45}{5} = 9, \qquad \bar{x}_2 = \frac{\sum x_2}{n_2} = \frac{70}{7} = 10$$

$$s = \sqrt{\frac{\sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\frac{62 + 54}{5 + 7 - 2}} = \sqrt{11.6} = 3.406$$

Therefore,

$$t = \frac{9 - 10}{3.406\sqrt{\dfrac{1}{5} + \dfrac{1}{7}}} = \frac{-1}{3.406 \times 0.5855} = -0.5.$$

The calculated value of |t| is less than the table value of t for 10 d.f. at 5% level of significance (2.228). The null hypothesis is accepted. Hence, we conclude that there is no significant difference in the efficacy of the two drugs in the matter of increasing weight.
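The pooled-variance calculation for the two drugs can be reproduced from the raw observations. This is a minimal sketch; the function name `pooled_t` is an assumption of this example.

```python
import math
from statistics import mean

def pooled_t(x1, x2):
    """Equal-variance two-sample t statistic, its d.f. and the pooled s."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = mean(x1), mean(x2)
    ss1 = sum((x - m1) ** 2 for x in x1)  # sum of squared deviations, sample 1
    ss2 = sum((x - m2) ** 2 for x in x2)  # sum of squared deviations, sample 2
    df = n1 + n2 - 2
    s = math.sqrt((ss1 + ss2) / df)
    t = (m1 - m2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, df, s

drug_a = [8, 12, 13, 9, 3]
drug_b = [10, 8, 12, 15, 6, 8, 11]
t, df, s = pooled_t(drug_a, drug_b)
print(t, df, s)
```

Working from the raw observations avoids the intermediate rounding that creeps into hand calculation.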

,'..il1fi[1:il1#HiTiXi#1'gi.T#,#'ff.[ir#iffifl
salesmcn. H*mtu..":,1*.:$.:,##,f.Tfr
                                       A        B
No. of sales                          10       18
Average sales (in lakh Rs.)          190      205
Standard deviation (in lakh Rs.)      20       25

Solution. Null hypothesis $H_0: \mu_1 = \mu_2$, i.e., there is no significant difference in the average sales between the two salesmen. Applying t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where

$$s = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{9(20)^2 + 17(25)^2}{10 + 18 - 2}} = 23.39$$

$$t = \frac{190 - 205}{23.39\sqrt{\dfrac{1}{10} + \dfrac{1}{18}}} = \frac{-15}{9.225} = -1.63.$$

The table value of t at 5% level of significance for 26 d.f. is 2.056. The calculated value of |t| is less than the table value. The hypothesis holds true. Hence, we conclude that there is no significant difference in the average sales between the two salesmen.
Confidence Interval for the Difference between the Two Means
Two samples of sizes $n_1$ and $n_2$ are randomly and independently drawn from two normally distributed populations with unknown but equal variances. The 100(1 - α) per cent confidence interval for $\mu_1 - \mu_2$ is given by

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,\nu}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}.$$

(3) Test of hypothesis about the difference between two means with dependent samples*. In the previous section, we assumed that the two random samples drawn from the two populations were independent. In many practical situations, it is not so. The samples are dependent, i.e., they are paired so that each observation in one sample is associated with some particular observation in the second sample. Because of this property, the test we are going to use will be called paired t-test. For the paired t-test, it is necessary that the observations in the two samples be collected in the form of matched pairs. If two samples are dependent, they must have the same number of units. Instead of obtaining two random samples, we can get one random sample of pairs, and the two measurements associated with a pair will be related to each other. This type arises in cases such as before and after experiment or when observations are matched on the basis of some other criterion. Suppose that two training methods are to be compared on the basis of average scores by management trainees divided into two equal size classes, each taught by one method. When experimental results are available, we test the null hypothesis that the average scores of the two methods are equal, i.e., $H_0: \mu_1 = \mu_2$. The test statistic

$$t = \frac{\bar{d}}{s/\sqrt{n}} = \frac{\bar{d}\sqrt{n}}{s}$$

follows t-distribution with (n - 1) d.f., where $\bar{d}$, the mean of the differences, is given by $\bar{d} = \sum d / n$, and s is the standard deviation of the differences, given by

$$s = \sqrt{\frac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\frac{\sum d^2 - n(\bar{d})^2}{n - 1}} = \sqrt{\frac{\sum d^2}{n - 1} - \frac{(\sum d)^2}{n(n - 1)}}$$

and n is the number of paired observations in the samples. If the computed value of t (at a specified level of significance for a given number of degrees of freedom) is less than the table value of t, our null hypothesis is accepted, otherwise rejected.

* This is also known as the paired t-test.

Illustration 5. Ten persons were appointed in an officer cadre in an office. Their performance was noted by giving a test and the marks were recorded out of 100. They were given 4 months training and a test was held and marks were recorded out of 100.

Employee:          A   B   C   D   E   F   G   H   I   J
Before training:  80  76  92  60  70  56  74  56  70  56
After training:   84  70  96  80  70  52  84  72  72  50

By applying the t-test, can it be concluded that the employees have benefited by the training?
Solution. Let us take the null hypothesis that the employees have not benefited by the training. Applying t-test:

Employees    Before (1st)    After (2nd)    d = (1st - 2nd)     d²
    A            80              84              -4             16
    B            76              70              +6             36
    C            92              96              -4             16
    D            60              80             -20            400
    E            70              70               0              0
    F            56              52              +4             16
    G            74              84             -10            100
    H            56              72             -16            256
    I            70              72              -2              4
    J            56              50              +6             36

  n = 10                                    Σd = -40      Σd² = 880

$$t = \frac{\bar{d}\sqrt{n}}{s}, \qquad \bar{d} = \frac{\sum d}{n} = \frac{-40}{10} = -4$$

$$s = \sqrt{\frac{\sum d^2 - n(\bar{d})^2}{n - 1}} = \sqrt{\frac{880 - 10(-4)^2}{9}} = \sqrt{80} = 8.944$$

$$t = \frac{-4\sqrt{10}}{8.944} = \frac{-4 \times 3.162}{8.944} = -1.414$$

ν = 10 - 1 = 9. For ν = 9, $t_{0.05}$ = 2.262.
The calculated value of |t| is less than the table value. The null hypothesis is accepted. Hence, the employees have not benefited by the training.
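The paired computation above can be sketched directly from the before and after marks. The function name `paired_t` is an assumption of this example.

```python
import math

def paired_t(first, second):
    """Paired t statistic on d = first - second, with n - 1 degrees of freedom."""
    d = [a - b for a, b in zip(first, second)]
    n = len(d)
    d_bar = sum(d) / n
    s = math.sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
    return d_bar * math.sqrt(n) / s, n - 1

before = [80, 76, 92, 60, 70, 56, 74, 56, 70, 56]
after = [84, 70, 96, 80, 70, 52, 84, 72, 72, 50]
t, df = paired_t(before, after)
print(t, df)
```

Because the test works on the differences alone, the order of subtraction only changes the sign of t, not its magnitude.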
Illustration 6. Ten workers were given a training programme with a view to shorten their assembly time for a certain mechanism. The results of the time and motion studies before and after the training programme are given below:

Worker                  :  1   2   3   4   5   6   7   8   9  10
First study (in mnts.)  : 15  18  20  17  16  14  21  19  13  22
Second study (in mnts.) : 14  16  21  10  15  18  19  16  14  20

On the basis of this data, can it be concluded that the training programme has shortened the average assembly time?

Solution. Let us take the null hypothesis that the training programme has not helped in reducing the average assembly time. Applying t-test:

$$t = \frac{\bar{d}\,\sqrt{n}}{s}$$

CALCULATIONS FOR $\bar{d}$ AND $s$

Worker   1st study   2nd study   d = (1st - 2nd)    d²
1           15          14            +1             1
2           18          16            +2             4
3           20          21            -1             1
4           17          10            +7            49
5           16          15            +1             1
6           14          18            -4            16
7           21          19            +2             4
8           19          16            +3             9
9           13          14            -1             1
10          22          20            +2             4
                                  Σd = 12       Σd² = 90

$$\bar{d} = \frac{\Sigma d}{n} = \frac{12}{10} = 1.2$$

$$s = \sqrt{\frac{\Sigma d^2}{n - 1} - \frac{(\Sigma d)^2}{n(n - 1)}} = \sqrt{\frac{90}{9} - \frac{(12)^2}{10 \times 9}} = \sqrt{10 - 1.6} = \sqrt{8.4} = 2.898$$

$$t = \frac{\bar{d}\,\sqrt{n}}{s} = \frac{1.2\sqrt{10}}{2.898} = \frac{1.2 \times 3.162}{2.898} = 1.309$$

For $v = 9$, the table value of $t$ at 5% level of significance is 2.262. Since the computed value of $t$ is less than the table value, the null hypothesis is accepted. Hence, the training programme has not shortened the average assembly time.
Confidence Interval for the Mean of the Difference. When the population of differences is normally distributed with unknown variance, for dependent samples a $100(1 - \alpha)$ per cent confidence interval for the mean of the differences is given by

$$\bar{d} \pm t_{\alpha/2}\, \frac{s}{\sqrt{n}}$$
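A minimal sketch of this interval follows. The function name `paired_mean_ci` is hypothetical, and the table value of t is passed in by the reader rather than computed. The figures used are those of the assembly-time example ($\bar{d} = 1.2$, $s = 2.898$, $n = 10$) together with the two-tailed 5% table value 2.262 for 9 d.f.

```python
import math

def paired_mean_ci(d_bar, s, n, t_table):
    """Interval d_bar +/- t_table * s / sqrt(n); t_table is read from a t-table."""
    half_width = t_table * s / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width

low, high = paired_mean_ci(d_bar=1.2, s=2.898, n=10, t_table=2.262)
print(round(low, 3), round(high, 3))
```

The interval includes zero, which agrees with the acceptance of the null hypothesis in that example.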

(4) Test of Hypothesis about Coefficient of Correlation

Case 1: Testing the hypothesis when the population coefficient of correlation equals zero, i.e., $H_0: \rho = 0$.

Here the null hypothesis is that there is no correlation in the population, i.e., $H_0: \rho = 0$. The population coefficient of correlation $\rho$ measures the degree of relationship between the variables. When $\rho = 0$, there is no linear relationship between the variables. In order to test this hypothesis, it is necessary to know the sample coefficient of correlation $r$ (which is the best estimate of $\rho$). The appropriate test statistic to be used here is given by
526 Business Statistics

$$t = \frac{r}{\sqrt{1 - r^2}}\, \sqrt{n - 2}$$

which follows t-distribution with $n - 2$ degrees of freedom.

If the computed value of $t$ is greater than the table value of $t$, the null hypothesis is rejected, which indicates that the sample data provide sufficient evidence that $\rho \neq 0$. Hence, it can be concluded that there is a linear relationship between the variables.
Illustration 7. In a study of the relationship between expenditure (X) and annual sales volume (Y), a sample of 10 firms yielded the coefficient of correlation r = 0.93. Can we conclude on the basis of this data that X and Y are linearly related?

Solution. The null hypothesis is $H_0: \rho = 0$, i.e., there is no relationship between the two variables. Using the t-test,

$$t = \frac{r}{\sqrt{1 - r^2}}\, \sqrt{n - 2} = \frac{0.93}{\sqrt{1 - (0.93)^2}}\, \sqrt{10 - 2} = \frac{0.93 \times 2.828}{0.3676} \approx 7.16$$

The degrees of freedom $v = n - 2 = 10 - 2 = 8$.

The table value of $t$ at 5% level of significance for 8 d.f. is 2.306. Since the computed value is much greater than the table value, the null hypothesis is rejected. Hence, it may be concluded that X and Y are linearly related.
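The statistic for testing $\rho = 0$ can be sketched as below. The function name `correlation_t` is hypothetical; the inputs r = 0.93 and n = 10 are those of the illustration on expenditure and sales volume.

```python
import math

def correlation_t(r, n):
    """Return (t, degrees of freedom) for H0: rho = 0, where t = r*sqrt(n-2)/sqrt(1-r^2)."""
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r), n - 2

t, df = correlation_t(0.93, 10)
print(round(t, 2), df)
```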
Case 2: Testing the hypothesis when the population coefficient of correlation equals some other value than zero, i.e., $H_0: \rho = \rho_0$.

In this case, when $\rho \neq 0$, the test based on t-distribution will not be appropriate. In testing the hypothesis, the use of Fisher's z-transformation will be applicable. Here, $r$ is transformed into $z$ by

$$z = \frac{1}{2} \log_e \frac{1 + r}{1 - r}$$

Here, $\log_e$ is a natural logarithm. Common logarithms may be shifted to natural logarithms by

$$\log_e X = 2.3026 \log_{10} X$$

where $X$ is a positive number. Since $\frac{1}{2}(2.3026) = 1.1513$, the transformation formula may be used as:

$$z = 1.1513 \log_{10} \frac{1 + r}{1 - r}$$

Here, it can be shown that $z$ is approximately normally distributed with mean

$$z_\rho = \frac{1}{2} \log_e \frac{1 + \rho}{1 - \rho}$$

and standard deviation

$$\sigma_z = \frac{1}{\sqrt{n - 3}}$$

Therefore, to test the null hypothesis that $\rho = \rho_0$, the test statistic would be:

$$Z = \frac{z - z_{\rho_0}}{\sigma_z}$$

which follows approximately the standard normal distribution. This test is more appropriate if sample size is large. The approximation is reasonably good if the sample size is at least 10.
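The transformation and the resulting test statistic can be sketched as follows. The function name `fisher_z_test` is hypothetical, and the inputs r = 0.884, $\rho_0$ = 0.92 and n = 20 are illustrative assumptions, not values stated by the text.

```python
import math

def fisher_z_test(r, rho0, n):
    """Standard normal statistic for H0: rho = rho0 using Fisher's z-transformation."""
    z_r = 0.5 * math.log((1 + r) / (1 - r))           # transformed sample value
    z_rho0 = 0.5 * math.log((1 + rho0) / (1 - rho0))  # mean of z under H0
    sigma_z = 1 / math.sqrt(n - 3)                    # standard deviation of z
    return (z_r - z_rho0) / sigma_z

z = fisher_z_test(r=0.884, rho0=0.92, n=20)
print(round(z, 2))
```

The result is referred to the table of areas under the standard normal curve.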
The distribution of $z$ has a standard deviation

$$\sigma_z = \frac{1}{\sqrt{n - 3}} = \frac{1}{4.123} = 0.2425$$

Therefore, the statistic is

$$Z = \frac{1.3938 - 1.5890}{0.2425} = -0.80$$

From the table of areas for the normal curve, we find that in about 20 per cent times we may expect a difference as large or larger than this. This hypothesis that r = 0.884 can be rejected at a low level of confidence.

The F-Distribution

The F-distribution is named in honour of R.A. Fisher who first studied it in 1924. This distribution is usually defined in terms of the ratio of the variances of two normally distributed populations. The quantity

$$\frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2}$$

follows the F-distribution, where $s_1^2 = \dfrac{\Sigma (x_1 - \bar{x}_1)^2}{n_1 - 1}$ is the unbiased estimator of $\sigma_1^2$ and $s_2^2 = \dfrac{\Sigma (x_2 - \bar{x}_2)^2}{n_2 - 1}$ is the unbiased estimator of $\sigma_2^2$.

If $\sigma_1^2 = \sigma_2^2$, then the statistic

$$F = \frac{s_1^2}{s_2^2}$$

follows F-distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.

The F-distribution sometimes is also called Variance Ratio distribution, which can be seen from the definition. The F-distribution depends on the degrees of freedom $v_1$ for the numerator and $v_2$ for the denominator. Therefore, the parameters for F-distribution are $v_1$ and $v_2$. For different values of $v_1$ and $v_2$ we shall have different distributions.

The probability density function of F-distribution is given by

$$f(F) = Y_0\, \frac{F^{(v_1/2) - 1}}{\left(1 + \dfrac{v_1 F}{v_2}\right)^{(v_1 + v_2)/2}}, \qquad 0 \le F < \infty$$

where $Y_0$ is a constant depending on the values $v_1$ and $v_2$ such that the area under the curve is unity. A typical F-distribution is given as below:

Some of the important properties of F-distribution are given below:

Small Sampling Theory 529

(1) The F-distribution is positively skewed and its skewness decreases with increase in $v_1$ and $v_2$.

(2) The value of F must always be positive or zero since variances are squares and can never assume negative values. Its value will always lie between 0 and $\infty$.

(3) The mean and variance of the F-distribution are

$$\text{Mean} = \frac{v_2}{v_2 - 2}, \quad \text{for } v_2 > 2$$

$$\text{Variance} = \frac{2 v_2^2 (v_1 + v_2 - 2)}{v_1 (v_2 - 2)^2 (v_2 - 4)}, \quad \text{for } v_2 > 4.$$

(4) The shape of the F-distribution depends upon the number of degrees of freedom.

(5) The areas in the left-hand side of the distribution can be found by taking the reciprocal of F values corresponding to the right-hand side, when the number of degrees of freedom in the numerator and in the denominator are interchanged. This is also known as reciprocal property and can be expressed as

$$F_{1 - \alpha,\, v_1,\, v_2} = \frac{1}{F_{\alpha,\, v_2,\, v_1}}$$

where the symbols have their usual meanings. This property is of great help when we want to know the lower tail F values from corresponding upper tail F values which are given in the Appendix.
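The mean and variance of the F-distribution stated in property (3) can be evaluated with a small sketch. The function names `f_mean` and `f_variance` are hypothetical, and the degrees of freedom used below are arbitrary illustrative values.

```python
def f_mean(v2):
    """Mean of the F-distribution; defined only for v2 > 2."""
    if v2 <= 2:
        raise ValueError("the mean exists only for v2 > 2")
    return v2 / (v2 - 2)

def f_variance(v1, v2):
    """Variance of the F-distribution; defined only for v2 > 4."""
    if v2 <= 4:
        raise ValueError("the variance exists only for v2 > 4")
    return 2 * v2 ** 2 * (v1 + v2 - 2) / (v1 * (v2 - 2) ** 2 * (v2 - 4))

print(f_mean(10), round(f_variance(9, 10), 4))
```

The mean depends only on the denominator degrees of freedom and approaches 1 as $v_2$ increases.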

Testing of Hypothesis for Equality of Two Variances

The test of equality of two population variances is based on the variances in two independently selected random samples drawn from two normal populations. Under the null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$,

$$F = \frac{s_1^2 / \sigma_1^2}{s_2^2 / \sigma_2^2} \quad \text{reduces to} \quad F = \frac{s_1^2}{s_2^2}$$

which follows F-distribution with $v_1$ and $v_2$ degrees of freedom. It is conventional to place the larger sample variance in the numerator for computational purposes. If we do so, the ratio of the sample variances will be equal to or greater than one.

If the computed value of F exceeds the table value of F, we reject the null hypothesis, i.e., the alternate hypothesis is accepted.
Illustration 10. Two sources of raw materials are under consideration by a company. Both sources seem to have similar characteristics but the company is not sure about their respective uniformity. A sample of 10 lots from source A yields a variance of 225 and a sample of 11 lots from source B yields a variance of 200. Is it likely that the variance of source A is significantly greater than the variance of source B at $\alpha = 0.01$?

Solution. Null hypothesis is $H_0: \sigma_1^2 = \sigma_2^2$, i.e., the variances of source A and that of source B are same. The F statistic to be used here is

$$F = \frac{s_1^2}{s_2^2}$$

where $s_1^2 = 225$ and $s_2^2 = 200$.

$$F = \frac{225}{200} = 1.125$$

The table value of F for $v_1 = 9$ and $v_2 = 10$ at 1% level of significance is 4.94. Since the computed value of F is smaller than the table value of F, the null hypothesis is accepted. Hence, the population variances of the two populations are same.

Confidence Interval for the Ratio of Two Variances

A $100(1 - \alpha)$ per cent confidence interval for the ratio of the variances of two normally distributed populations is given by

$$\frac{s_1^2 / s_2^2}{F_{\alpha/2,\, v_1,\, v_2}} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\, F_{\alpha/2,\, v_2,\, v_1}$$

where the symbols have their usual meanings.
MISCELLANEOUS ILLUSTRATIONS

Illustration 11. The nine items of a sample had the following values:

45, 47, 50, 52, 48, 47, 49, 53, 50

The mean is 49 and the sum of squares of deviations taken from mean is 52. Can this sample be regarded as taken from the population having 47 as mean? Also obtain 95% and 99% confidence limits of the population mean. (MBA, Delhi Univ.)

Solution. The null hypothesis is $H_0: \mu = 47$, i.e., the population mean is 47. We are given that $\bar{x} = 49$, $\Sigma (x - \bar{x})^2 = 52$, $\mu = 47$ and $n = 9$. Applying t-test:

$$t = \frac{\bar{x} - \mu}{s}\, \sqrt{n}, \quad \text{where } s = \sqrt{\frac{\Sigma (x - \bar{x})^2}{n - 1}} = \sqrt{\frac{52}{8}} = 2.55$$

Substituting the values, we have

$$t = \frac{49 - 47}{2.55}\, \sqrt{9} = \frac{2 \times 3}{2.55} = 2.35$$

The table value of $t$ for 8 d.f. at 5% level of significance is 2.31. Since the computed value is slightly greater than the table value, the null hypothesis is rejected. Hence, the sample is not drawn from the population having 47 as mean.

95% confidence interval of the population mean is given by:

$$\bar{x} \pm t_{0.05}\, \frac{s}{\sqrt{n}} = 49 \pm \frac{2.31 \times 2.55}{3} = 49 \pm 1.96 = 47.04 \text{ to } 50.96$$

99% confidence interval of the population mean is given by:

$$\bar{x} \pm t_{0.01}\, \frac{s}{\sqrt{n}} = 49 \pm \frac{3.36 \times 2.55}{3} = 49 \pm 2.86 = 46.14 \text{ to } 51.86$$

Illustration 12. A company is interested in knowing if there is a significant difference in the average salary received by foremen in two divisions. Accordingly, samples of 12 foremen in the first division and 10 foremen in the second division are selected at random. Based upon experience, foremen's salaries are known to be approximately normally distributed, and the standard deviations are about the same.

                                          First division   Second division
Sample size                                     12               10
Average weekly salary of foremen (Rs.)        1050              980
Standard deviation of salaries (Rs.)            68               74

Solution. Let us take the null hypothesis that the average salary received by foremen in the two divisions does not differ significantly. Applying t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{S}\, \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

$$S = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{11 (68)^2 + 9 (74)^2}{12 + 10 - 2}} = 70.76$$

$$t = \frac{1050 - 980}{70.76}\, \sqrt{\frac{12 \times 10}{12 + 10}} = \frac{70 \times 2.34}{70.76} = 2.31$$

For $v = 20$, $t_{0.05} = 2.086$. The calculated value of $t$ is greater than the table value. The null hypothesis does not hold good. Hence, there is significant difference in the salary received by foremen in the two divisions.
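The pooled computation used for two independent samples can be sketched from summary figures alone. The function name `pooled_t` is hypothetical; the inputs are the sample sizes, means and standard deviations of the two divisions given in Illustration 12.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, pooled S, degrees of freedom) for two independent samples."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    s = math.sqrt(pooled_var)
    t = (mean1 - mean2) / s * math.sqrt(n1 * n2 / (n1 + n2))
    return t, s, n1 + n2 - 2

t, s, df = pooled_t(1050, 68, 12, 980, 74, 10)
print(round(t, 2), round(s, 2), df)
```

The sketch assumes, as the illustration does, that the two populations are normal with about the same standard deviation.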
Illustration 13. Ten accountants were given intensive coaching and four tests were conducted in a month. The scores of tests 1 and 4 are given below:

S. No. of Accountants :  1   2   3   4   5   6   7   8   9  10
Marks in 1st test     : 50  42  51  42  60  41  70  55  62  38
Marks in 4th test     : 62  40  61  52  68  51  64  63  72  50

Does the score from test 1 to test 4 show an improvement? Test at 5% level of significance.

Solution. Let us take the null hypothesis that there is no improvement from test 1 to 4. Applying t-test:

$$t = \frac{\bar{d}\,\sqrt{n}}{s}$$

S. No.   1st test   4th test   d = (4th - 1st)    d²
1           50         62          +12           144
2           42         40           -2             4
3           51         61          +10           100
4           42         52          +10           100
5           60         68           +8            64
6           41         51          +10           100
7           70         64           -6            36
8           55         63           +8            64
9           62         72          +10           100
10          38         50          +12           144
                                Σd = 72      Σd² = 856

$$\bar{d} = \frac{\Sigma d}{n} = \frac{72}{10} = 7.2$$

$$s = \sqrt{\frac{\Sigma d^2 - n(\bar{d})^2}{n - 1}} = \sqrt{\frac{856 - 10(7.2)^2}{9}} = \sqrt{\frac{856 - 518.4}{9}} = 6.125$$

$$t = \frac{7.2\sqrt{10}}{6.125} = \frac{7.2 \times 3.162}{6.125} = 3.72$$

For $v = 9$, $t_{0.05} = 1.83$.

The calculated value of $t$ is greater than the table value. The null hypothesis is rejected. Hence, the scores from test 1 to test 4 show an improvement.
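Illustration 14. The means of two random samples of sizes 9 and 7 are 196.42 and 198.82 respectively. The sum of the squares of the deviations from the mean are 26.94 and 18.73 respectively. Can the sample be considered to have been drawn from the same normal population?

Solution. Let us take the null hypothesis that the samples are drawn from the same normal population. Applying t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{S}\, \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

$$S = \sqrt{\frac{\Sigma (x_1 - \bar{x}_1)^2 + \Sigma (x_2 - \bar{x}_2)^2}{n_1 + n_2 - 2}} = \sqrt{\frac{26.94 + 18.73}{9 + 7 - 2}} = \sqrt{\frac{45.67}{14}} = 1.81$$

$$t = \frac{196.42 - 198.82}{1.81}\, \sqrt{\frac{9 \times 7}{9 + 7}} = -\frac{2.40 \times 1.984}{1.81} = -\frac{4.76}{1.81} = -2.63$$

For $v = 14$, $t_{0.05} = 2.145$. Since the calculated value of $t$, taken without regard to sign, is greater than the table value, we reject the null hypothesis. Hence, the difference between the two means is significant. Therefore, the samples cannot be said to have been drawn from the same normal population.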

Illustration 15. Strength tests carried out on samples of two yarns spun to the same count gave the following results:

           Sample size   Sample mean   Sample variance
Yarn A          4            52              42
Yarn B          9            42              56

The strengths are expressed in pounds. Is the difference in mean strengths significant of real difference in the mean strengths of the sources from which the samples are drawn? (MBA, DU)

Solution. Let us take the null hypothesis that there is no significant difference in the mean strengths of the two types of yarns. Applying t-test,

$$t = \frac{\bar{x}_1 - \bar{x}_2}{S}\, \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$

where

$$S = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{3 \times 42 + 8 \times 56}{4 + 9 - 2}} = \sqrt{\frac{574}{11}} = 7.224$$

$$t = \frac{52 - 42}{7.224}\, \sqrt{\frac{4 \times 9}{4 + 9}} = \frac{10 \times 1.664}{7.224} = 2.303$$

$v = n_1 + n_2 - 2 = 4 + 9 - 2 = 11$. For $v = 11$, $t_{0.05} = 2.20$.

The calculated value of $t$ is more than the table value of $t$. The null hypothesis is rejected. The difference in the mean strengths of the two types of yarn is significant.
Illustration 16. To verify whether a course in accounting improved performance, a similar test was given to 12 participants both before and after the course. The original marks recorded in alphabetical order of the participants were 44, 40, 61, 52, 32, 44, 70, 41, 67, 72, 53 and 72. After the course, the marks were in the same order: 53, 38, 69, 57, 46, 39, 73, 48, 73, 74, 60 and 78. Test whether the course was useful.

Solution. Let us take the null hypothesis that the course has not improved the performance of the participants. Applying t-test,

$$t = \frac{\bar{d}\,\sqrt{n}}{s}$$

CALCULATION FOR $\bar{d}$ AND $s$

Before   After   d = (2nd - 1st)    d²
44        53          +9            81
40        38          -2             4
61        69          +8            64
52        57          +5            25
32        46         +14           196
44        39          -5            25
70        73          +3             9
41        48          +7            49
67        73          +6            36
72        74          +2             4
53        60          +7            49
72        78          +6            36
                  Σd = 60      Σd² = 578

$$\bar{d} = \frac{\Sigma d}{n} = \frac{60}{12} = 5$$

$$s = \sqrt{\frac{\Sigma d^2 - n(\bar{d})^2}{n - 1}} = \sqrt{\frac{578 - 12(5)^2}{11}} = \sqrt{\frac{278}{11}} = 5.027$$

$$t = \frac{5\sqrt{12}}{5.027} = \frac{17.32}{5.027} = 3.45$$

For $v = 11$, $t_{0.05} = 2.201$. The calculated value of $t$ is more than the table value of $t$. The null hypothesis is rejected. Hence, the course has improved the performance of the participants.
Small SamPlingTheory 533

thc followingruul8: obtained :


for length of lifc' and thc following dalrwcre
Igrictnrion l?. Samnfcs ofT;liffcrcnttypasofbrlbt;;?Ed

Samplesize 8 1.

SampleMcan 1234 hrs' 1136 hrs'

locc in thc rncra tts€ttgthl i*pf, S.O.. 36 hrs' .


40 hrs'
MBA' Delhi univ" 2003)
, Is &c differcncc in mcan lifc of two tlpei of bulbs sigrificurt ?
(
(MDt, DU,2N2l t1'pcs tif bulbs'
that thcrc ,, lii ,li"in*t difrcrcncc in thc mcan life of tho rwo
rythr of drc trro typcr oi solution. Lct us takp rhc nu[ hypothcsis
Applyingt-tcsg
W
,= f7.-i^ i,,.,
(nr- l) ri +(n2-t);2r
.7.T21 nt+ n2- 7

(s- l) 362
8+7
+(1 -l) 4d
-2
*W-rr.,
, The diffcrcncc in.the there

tiworo 12prrricinrats
G 4{,.l10,51, j2,3i 4a.
1t,73, 71,60,.rrrd ?S.

PryriciP.nts. APplyrnt

             Sample I                              Sample II
 x1    (x1 - x̄1)   (x1 - x̄1)²         x2    (x2 - x̄2)   (x2 - x̄2)²
 20       -2            4              27       -8           64
 16       -6           36              33       -2            4
 26       +4           16              42       +7           49
 27       +5           25              35        0            0
 23       +1            1              32       -3            9
 22        0            0              34       -1            1
 18       -4           16              38       +3            9
 24       +2            4              28       -7           49
 25       +3            9              41       +6           36
 19       -3            9              43       +8           64
                                       30       -5           25
                                       37       +2            4
Σx1 = 220        Σ(x1 - x̄1)² = 120    Σx2 = 420        Σ(x2 - x̄2)² = 314

$$s_1^2 = \frac{\Sigma (x_1 - \bar{x}_1)^2}{n_1 - 1} = \frac{120}{9} = 13.333$$

$$s_2^2 = \frac{\Sigma (x_2 - \bar{x}_2)^2}{n_2 - 1} = \frac{314}{11} = 28.545$$

Since the larger variance is placed in the numerator, therefore,

$$F = \frac{s_2^2}{s_1^2} = \frac{28.545}{13.333} = 2.14$$
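The variance-ratio computation from raw samples can be sketched as below. The function names `unbiased_variance` and `variance_ratio` are hypothetical; the two lists are the Sample I and Sample II values read from the working table above.

```python
def unbiased_variance(xs):
    """Sample variance with divisor (n - 1)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

def variance_ratio(sample_a, sample_b):
    """Return (F, v1, v2) with the larger sample variance in the numerator."""
    var_a, var_b = unbiased_variance(sample_a), unbiased_variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

sample_1 = [20, 16, 26, 27, 23, 22, 18, 24, 25, 19]
sample_2 = [27, 33, 42, 35, 32, 34, 38, 28, 41, 43, 30, 37]
F, v1, v2 = variance_ratio(sample_1, sample_2)
print(round(F, 2), v1, v2)
```

Because the larger variance goes on top, the numerator degrees of freedom here belong to Sample II.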

Chi-Square Test

INTRODUCTION

In the previous chapter on small sampling theory, it was necessary to make certain assumptions about the populations from which the samples were drawn. In many of the statistical tests, we had to assume that the samples came from normal populations. When this assumption cannot be justified, it is necessary to use procedures that do not require that these conditions are met. These are generally referred to as non-parametric methods. In this chapter, we shall discuss the $\chi^2$ test which belongs to this category.

$\chi^2$ (chi-square) test is based on $\chi^2$ distribution which was first used by Karl Pearson in the year 1900.
The Chi-square Distribution

For large sample size, the sampling (probability) distribution of $\chi^2$ can be closely approximated by a continuous curve known as the chi-square distribution. The probability function of $\chi^2$ distribution is given by

$$f(\chi^2) = c\, (\chi^2)^{(v/2 - 1)}\, e^{-\chi^2 / 2}$$

where,
e = 2.71828
v = number of degrees of freedom
c = a constant depending only on v.

As in the case of the t-distribution, $f(\chi^2)$ is a family of distributions, one for each value of $v$.
Important Properties of Chi-square Distribution

(1) $\chi^2$ distribution is a continuous probability distribution which has the value zero at its lower limit and extends to infinity in the positive direction. A negative value of $\chi^2$ is not possible. The value of $\chi^2$ can never be negative since the differences between the observed and expected frequencies are always squared.

Solving these inequalities for $\sigma^2$, we get

$$\frac{(n - 1) s^2}{\chi^2_{\alpha/2}} < \sigma^2 < \frac{(n - 1) s^2}{\chi^2_{1 - \alpha/2}}$$

which is the required $100(1 - \alpha)$ per cent confidence interval for $\sigma^2$.
Test of Hypothesis Concerning Variance. In testing hypothesis about the variance of a normally distributed population, the null hypothesis is $H_0: \sigma^2 = \sigma_0^2$, where $\sigma_0^2$ is some specified value of the population variance.

We know that

$$\chi^2 = \frac{(n - 1) s^2}{\sigma_0^2}$$

where $\chi^2$ is computed from a random sample of size $n$.

If $\chi^2 < \chi^2_{1 - \alpha/2}$ or $\chi^2 > \chi^2_{\alpha/2}$, i.e., when the computed value of $\chi^2$ lies in the rejection region, we reject the null hypothesis, otherwise we accept the null hypothesis. This is shown in the diagram given below:

[Figure: $\chi^2$ curve showing the acceptance region of area $(1 - \alpha)$ lying between two rejection regions of area $\alpha/2$ each.]
Illustration 1. Weights in kilograms of 10 shipments are given below:

38, 40, 45, 53, 47, 43, 55, 48, 52, 49.

Can we say that variance of the distribution of weight of all shipments from which the above sample of 10 shipments was drawn is equal to 20 square kilogram?

Solution. Let the null hypothesis be that the variance of the distribution of shipments' weight is 20 square kilogram, i.e., $H_0: \sigma^2 = 20$.

Weight (in kg.)
    x         (x - x̄)      (x - x̄)²
   38           -9            81
   40           -7            49
   45           -2             4
   53           +6            36
   47            0             0
   43           -4            16
   55           +8            64
   48           +1             1
   52           +5            25
   49           +2             4
Σx = 470    Σ(x - x̄) = 0   Σ(x - x̄)² = 280

$$\bar{x} = \frac{\Sigma x}{n} = \frac{470}{10} = 47$$
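Since $(n - 1) s^2 = \Sigma (x - \bar{x})^2$, the test statistic for a hypothesised variance can be computed directly from the sum of squared deviations. The sketch below is illustrative only: the function name `variance_chi_square` is hypothetical, and the data are the ten shipment weights with the hypothesised variance of 20.

```python
def variance_chi_square(xs, sigma0_sq):
    """Return (chi-square, degrees of freedom) for H0: population variance = sigma0_sq."""
    n = len(xs)
    mean = sum(xs) / n
    sum_sq_dev = sum((x - mean) ** 2 for x in xs)  # equals (n - 1) * s^2
    return sum_sq_dev / sigma0_sq, n - 1

weights = [38, 40, 45, 53, 47, 43, 55, 48, 52, 49]
chi2, df = variance_chi_square(weights, 20)
print(chi2, df)
```

The computed value is then compared with the lower and upper table values of $\chi^2$ for the returned degrees of freedom.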
Chi-Sguaretest 5f/


                               Attribute B
Attribute A    B1     B2     B3    ...   Bj    ...   Bc     Total
A1             O11    O12    O13   ...   O1j   ...   O1c     R1
A2             O21    O22    O23   ...   O2j   ...   O2c     R2
A3             O31    O32    O33   ...   O3j   ...   O3c     R3
...            ...    ...    ...   ...   ...   ...   ...     ...
Ai             Oi1    Oi2    Oi3   ...   Oij   ...   Oic     Ri
...            ...    ...    ...   ...   ...   ...   ...     ...
Ar             Or1    Or2    Or3   ...   Orj   ...   Orc     Rr
Total          C1     C2     C3    ...   Cj    ...   Cc      N
The expected cell frequencies are computed as

$$E_{ij} = \frac{R_i}{N} \times C_j = \frac{R_i C_j}{N}$$

To conduct the test, the same $\chi^2$ is employed as discussed earlier, i.e.,

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \quad \text{or} \quad \Sigma\, \frac{(O - E)^2}{E}$$

which will follow $\chi^2$ distribution with $v = (r - 1)(c - 1)$ degrees of freedom.

While applying the test, the null hypothesis is that the two attributes are independent. If the calculated value of $\chi^2$ is less than the table value at a specified level of significance, the null hypothesis holds true, i.e., the two attributes are independent. If the calculated value of $\chi^2$ is greater than the table value, the null hypothesis is rejected, i.e., the two attributes are associated.

Illustration 2. A sample of 200 persons with a particular disease was selected. Out of these, 100 were given a drug and the others were not given any drug. The results are as follows:

                  Number of Persons
               Drug    No Drug    Total
Cured           65       55        120
Not cured       35       45         80
Total          100      100        200

Test whether the drug is effective or not.

Solution. Let us take the null hypothesis that the drug is not effective in curing the disease. Applying $\chi^2$ test:

The expected cell frequencies are computed as follows:

$$E_{11} = \frac{120 \times 100}{200} = 60; \qquad E_{12} = \frac{120 \times 100}{200} = 60$$

$$E_{21} = \frac{80 \times 100}{200} = 40; \qquad E_{22} = \frac{80 \times 100}{200} = 40$$

The table of expected frequencies is as follows:

   60     60    120
   40     40     80
  100    100    200

   O      E    (O - E)²   (O - E)²/E
   65     60      25        0.417
   35     40      25        0.625
   55     60      25        0.417
   45     40      25        0.625
                     Σ[(O - E)²/E] = 2.084

$$\chi^2 = \Sigma\, \frac{(O - E)^2}{E} = 2.084$$

$$v = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1$$

For $v = 1$, $\chi^2_{0.05} = 3.84$.

The calculated value of $\chi^2$ is less than the table value. The null hypothesis is accepted. Hence, the drug is not effective in curing the disease.

It may be noted that it is not necessary to calculate all the expected frequencies. It would be enough, in a 2 × 2 table, if we calculate only one cell expected frequency. The others can be obtained by the process of deduction.
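The expected-frequency rule $E_{ij} = R_i C_j / N$ and the resulting statistic can be sketched for a table of any size. The function name `chi_square_independence` is hypothetical; the two tables used are the drug data (2 × 2) and the drug versus sugar pills data (2 × 3) from the illustrations in this section.

```python
def chi_square_independence(observed):
    """Return (chi-square, degrees of freedom) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected cell frequency
            chi2 += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

chi2_a, df_a = chi_square_independence([[65, 55], [35, 45]])
chi2_b, df_b = chi_square_independence([[150, 30, 70], [130, 40, 80]])
print(round(chi2_a, 3), df_a, round(chi2_b, 3), df_b)
```

Small differences in the third decimal place from the hand computation arise because the text rounds each term before adding.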
Illustration 3. A certain drug is claimed to be effective in curing cold. In an experiment on 500 persons with cold, half of them were given the drug and half of them were given the sugar pills. The patients' reactions to the treatment are recorded in the following table:

               Helped   Harmed   No effect   Total
Drug            150       30        70        250
Sugar pills     130       40        80        250
Total           280       70       150        500

On the basis of this data, can it be concluded that there is a significant difference in the effect of the drug and sugar pills? (MBA, Kumaun Univ., 2002; MBA, DU, 2002)

Solution. Let us take the null hypothesis that there is no difference in the drug and sugar pills as far as their effect on curing cold is concerned.

Since it is a 2 × 3 table, the degrees of freedom would be (2 - 1)(3 - 1) = 2, i.e., we will have to calculate only two expected frequencies and the other four can be automatically determined.

Expected frequencies are computed as follows:

$$E_{11} = \frac{250}{500} \times 280 = 140; \qquad E_{12} = \frac{250}{500} \times 70 = 35$$

The table of expected frequencies is:

  140     35     75    250
  140     35     75    250
  280     70    150    500

Arranging the observed and expected frequencies in the following table:

   O       E     (O - E)²   (O - E)²/E
  150     140      100        0.714
  130     140      100        0.714
   30      35       25        0.714
   40      35       25        0.714
   70      75       25        0.333
   80      75       25        0.333
                       Σ[(O - E)²/E] = 3.522

$$\chi^2 = \Sigma\, \frac{(O - E)^2}{E} = 3.522$$

The table value of $\chi^2$ for 2 d.f. at 5% level of significance is 5.99. The calculated value of $\chi^2$ is less than the table value. Therefore, the null hypothesis is accepted. Hence, we conclude that the drug and sugar pills do not differ significantly in curing cold.

(5) Test of Goodness of Fit. Tests of goodness of fit are used when we want to determine whether an actual sample distribution matches a known theoretical distribution. $\chi^2$ test is popularly known as a test of goodness of fit for the reason that it enables us to ascertain how well the theoretical distributions such as Binomial, Poisson, Normal, etc., fit empirical distributions, i.e., those obtained from sample data. We hypothesize a theoretical distribution (Normal, for example) and then test to determine whether our sample came from or is comparable to the theoretical distribution. If there is a high degree of conformity between the two distributions, any slight difference may be assumed to be the result of sampling variation. On the other hand, any large discrepancy between the two distributions may lead to the conclusion that the sample was drawn from some theoretical distribution other than the one proposed.

While applying the chi-square test of goodness of fit, the null hypothesis usually states that the sample is drawn from the theoretical population distribution, and the alternate hypothesis usually states that it is not. The following illustrations would illustrate the use of $\chi^2$ test of goodness of fit.
Illustration 4. The number of parts demanded for a particular spare part in a factory was found to vary from day to day. In a sample study, the following information was obtained:

Day                   : Mon.   Tues.   Wed.   Thurs.   Fri.   Sat.   Total
No. of parts demanded : 1124   1125    1110   1120     1126   1115   6720

Test the hypothesis that the number of parts demanded does not depend on the day of the week. (MBA, Delhi Univ., 2000)

Solution. Let us take the null hypothesis that the number of parts demanded does not depend on the day of the week. The number of spare parts demanded in a week is 6720 and, if all days are same, we should expect 6720/6, i.e., 1120 spare parts on a day of the week.

Day            O       E     (O - E)²   (O - E)²/E
Monday       1124    1120       16        0.014
Tuesday      1125    1120       25        0.022
Wednesday    1110    1120      100        0.089
Thursday     1120    1120        0        0.000
Friday       1126    1120       36        0.032
Saturday     1115    1120       25        0.022
                            Σ[(O - E)²/E] = 0.179

The table value of $\chi^2$ for 5 d.f. at 5% level of significance is 11.07. The computed value of $\chi^2$ is much less than the table value. The null hypothesis is accepted and we conclude that the demand for spare parts does not depend on the day of the week.
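The goodness-of-fit statistic for equal expected frequencies can be sketched as follows. The function name `chi_square_gof` is hypothetical; the observed counts are the daily demand figures of the illustration, and the expected count is the weekly total divided equally over six days.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [1124, 1125, 1110, 1120, 1126, 1115]
expected = [sum(observed) / len(observed)] * len(observed)  # 1120 for each day
chi2 = chi_square_gof(observed, expected)
df = len(observed) - 1
print(round(chi2, 3), df)
```

The unrounded sum is about 0.180; the figure 0.179 in the table comes from adding terms already rounded to three places.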

Illustration 5. A survey of 320 families with 5 children each revealed the following distribution:

No. of boys     :  5    4     3    2    1    0
No. of girls    :  0    1     2    3    4    5
No. of families : 14   56   110   88   40   12

Is this result consistent with the hypothesis that male and female births are equally probable? (MBA, IGNOU, 2002)

Solution. Let us take the null hypothesis that male and female births are equally probable. On this assumption, the probability of a male birth is $p = 1/2$. The expected number of families can be calculated by the use of binomial distribution. The probability of $r$ male births in a family of 5 is given by

$$P(r) = {}^5C_r\, p^r q^{5 - r} = {}^5C_r \left(\tfrac{1}{2}\right)^5 \qquad [\text{for } r = 0, 1, 2, 3, 4, 5; \ \because\ p = q = \tfrac{1}{2}]$$

To get the expected frequencies, multiply $P(r)$ by the total number $N = 320$. The calculations are shown below in the table:

 r     P(r)                        Expected frequency
 0     5C0 (1/2)^5 = 1/32          320 × 1/32  = 10
 1     5C1 (1/2)^5 = 5/32          320 × 5/32  = 50
 2     5C2 (1/2)^5 = 10/32         320 × 10/32 = 100
 3     5C3 (1/2)^5 = 10/32         320 × 10/32 = 100
 4     5C4 (1/2)^5 = 5/32          320 × 5/32  = 50
 5     5C5 (1/2)^5 = 1/32          320 × 1/32  = 10

Arranging the observed and expected frequencies in the following table and calculating $\chi^2$:

   O       E     (O - E)²   (O - E)²/E
   14      10       16        1.60
   56      50       36        0.72
  110     100      100        1.00
   88     100      144        1.44
   40      50      100        2.00
   12      10        4        0.40
                       Σ[(O - E)²/E] = 7.16

$$\chi^2 = \Sigma\, \frac{(O - E)^2}{E} = 7.16$$

The table value of $\chi^2$ for $v = 6 - 1 = 5$ at 5% level of significance is 11.07. The computed value of $\chi^2 = 7.16$ is less than the table value. Therefore, the null hypothesis is accepted. Thus, it can be concluded that male and female births are equally probable.
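The binomial expected frequencies and the resulting statistic can be sketched as below. The function name `binomial_expected` is hypothetical; the observed list gives the number of families with 0, 1, 2, 3, 4 and 5 boys, read from the survey table.

```python
import math

def binomial_expected(n_trials, p, total):
    """Expected frequencies total * C(n, r) * p^r * (1 - p)^(n - r) for r = 0..n."""
    return [
        total * math.comb(n_trials, r) * p ** r * (1 - p) ** (n_trials - r)
        for r in range(n_trials + 1)
    ]

observed = [12, 40, 88, 110, 56, 14]        # families with 0, 1, ..., 5 boys
expected = binomial_expected(5, 0.5, 320)   # equally probable male and female births
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(expected, round(chi2, 2))
```

Because the hypothesis fixes $p = 1/2$ in advance, no parameter is estimated from the data and the degrees of freedom remain 6 - 1 = 5.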
Illustration 6. The figures given below are (a) the theoretical frequencies of a distribution and (b) the frequencies of the normal distribution having the same mean, standard deviation and total frequency as in (a):

(a)  1  12  66  220  495  792  924  792  495  220  66  12  1
(b)  2  15  66  210  484  799  943  799  484  210  66  15  2

Do you think that the normal distribution provides a good fit to the data?

Solution. Let us take the null hypothesis that there is no difference in the observed frequencies and expected frequencies as obtained by the normal distribution.

Since the frequencies at the two corners are less than 5, they would be combined with the adjacent frequency.

        O              E          (O - E)²   (O - E)²/E
  1 + 12 = 13     2 + 15 = 17       16         0.941
       66              66            0         0.000
      220             210          100         0.476
      495             484          121         0.250
      792             799           49         0.061
      924             943          361         0.383
      792             799           49         0.061
      495             484          121         0.250
      220             210          100         0.476
       66              66            0         0.000
 12 + 1 = 13     15 + 2 = 17        16         0.941
                              Σ[(O - E)²/E] = 3.839

$v = 13 - 2 - 3 = 8$ (after grouping, 11 classes are left and for normal the degrees of freedom is less by 3 than the number of classes).

The table value of $\chi^2$ for 8 d.f. at 5% level of significance is 15.51. The calculated value of $\chi^2$ is less than the table value and hence the fit is good.

r0 x 10/32= 100 (6) Tcrt.of Homogonclty. It is frequontly of interest to explore the proposition that several
arc honrogcnoous with respect to some characteristic of interest. For examplc, we may
|0 x tU32= lfi] intcrcstcd in knowing of somc raw matcrial availablc from s€veral rctallers is homogeneous.
wsy of statting thc problcm is to sa! that we arc ititercstcd in testing the null hypothesis
l0 x 5/32=50
hat sevcral populations arJ homogencous with rtspect to the proporrtion of subjcct falling into
l0x l/32-t0 leveqal cat:gorics or somc other criterion of classificstion. A ranclom sampld is drawn from each
552 Business Statistics

of the population and the number in each sample falling into each category is determined. Th€
sample data is displayed in a contingency tabte. The anatytical procedure is same as that discussed (1

for test of independence. (,i


The main difference is that, in tests of independence, we are concerned with the problem whether the two attributes are independent or not, while in tests of homogeneity we are concerned whether the different samples come from the same population. Another difference is that a test of independence involves a single sample, but a test of homogeneity involves two or more samples, one from each population. When there are two populations involved, and when the characteristic of interest consists of two categories, the test of homogeneity is the same as testing a hypothesis about the difference between two population proportions, which was discussed in the chapter on tests of hypothesis.
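
The equivalence in the two-population, two-category case can be illustrated numerically. The sketch below uses hypothetical counts (45 of 100 and 30 of 100, invented purely for illustration and not taken from the text) and assumes the pooled two-proportion z statistic without continuity correction. Under those assumptions the square of z equals the $\chi^2$ statistic of the 2 x 2 table.

```python
import math


def chi_square_2x2(x1, n1, x2, n2):
    """Pearson chi-square for a 2 x 2 homogeneity table built from
    x1 successes out of n1 and x2 successes out of n2."""
    rows = [[x1, n1 - x1], [x2, n2 - x2]]
    row_totals = [n1, n2]
    col_totals = [x1 + x2, (n1 - x1) + (n2 - x2)]
    total = n1 + n2
    stat = 0.0
    for i, row in enumerate(rows):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / total  # expected frequency
            stat += (o - e) ** 2 / e
    return stat


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference of two proportions, using the
    pooled proportion for the standard error (no continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical counts, for illustration only.
x1, n1, x2, n2 = 45, 100, 30, 100

chi_sq = chi_square_2x2(x1, n1, x2, n2)
z = two_proportion_z(x1, n1, x2, n2)

print(f"chi-square = {chi_sq:.4f}, z^2 = {z ** 2:.4f}")
```

Since $\chi^2$ with 1 d.f. is the square of a standard normal variable, both tests lead to the same decision at a given two-sided level of significance.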
Illustration. A random sample of 400 persons in all was selected from three age groups and each person was asked to specify which of three types of TV programmes he preferred. The results are shown in the following table :

Age group       A      B      C      Total
Under 30        120    30     50     200
30-44           10     75     15     100
45 and above    10     30     60     100
Total           140    135    125    400

Test the hypothesis that the populations are homogeneous with respect to the types of television programme they prefer.

Solution. Let us take the null hypothesis that the populations are homogeneous with respect to the different types of television programmes they prefer.

O      E        (O - E)^2    (O - E)^2/E
120    70.00    2500.00      35.7143
10     35.00    625.00       17.8571
10     35.00    625.00       17.8571
30     67.50    1406.25      20.8333
75     33.75    1701.56      50.4166
30     33.75    14.06        0.4166
50     62.50    156.25       2.5000
15     31.25    264.06       8.4499
60     31.25    826.56       26.4499

$\sum (O - E)^2 / E = 180.4948$

$\chi^2 = \sum \frac{(O - E)^2}{E} = 180.495$
The table value of $\chi^2$ for (3 - 1)(3 - 1) = 4 d.f. at 5% level of significance is 9.488.
The calculated value of $\chi^2$ is much greater than the table value. We reject the null hypothesis and conclude that the populations are not homogeneous with respect to the type of TV programmes preferred.
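
The homogeneity computation above can be reproduced with a minimal sketch. It assumes the 3 x 3 table of observed counts from the illustration, computes each expected frequency as (row total x column total) / grand total, and takes the critical value 9.488 from the text. Variable names are illustrative.

```python
# Observed counts from the illustration: rows are age groups,
# columns are programme types A, B, C.
table = {
    "Under 30": [120, 30, 50],
    "30-44": [10, 75, 15],
    "45 and above": [10, 30, 60],
}

rows = list(table.values())
row_totals = [sum(r) for r in rows]
col_totals = [sum(c) for c in zip(*rows)]
grand_total = sum(row_totals)

# Expected frequency under homogeneity: row total * column total / grand total.
expected = [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(rows, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(rows) - 1) * (len(col_totals) - 1)

TABLE_VALUE = 9.488  # chi-square for 4 d.f. at the 5% level, quoted in the text
reject_null = chi_sq > TABLE_VALUE

print(f"chi-square = {chi_sq:.3f}, d.f. = {df}, reject H0: {reject_null}")
```

The unrounded statistic differs from the tabulated total only in the fourth decimal place, because the worked table truncates the individual terms.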

Cautions while Applying $\chi^2$ Test


The $\chi^2$ test is very popularly used in practice. However, it is unfortunate to find that the number of misuses of the $\chi^2$ test has become surprisingly large. The test must be used with great care, keeping in mind the assumptions on which it is based. Some sources of error in the application of this test revealed by a survey of all papers published in the Journal of Experimental Psychology are :
