Sampling Practicals
BATHINDA
SAMPLING THEORY
PRACTICALS
SUBMITTED TO
Dr. SANDEEP KAUR
SUBMITTED BY
ARYA M P
20msstat22
MSC STATISTICS
1) Generate a data set of 200 population units from a normal distribution
with mean 12 and standard deviation 3.04. Select ten random samples,
each of size 15, to show that the sample mean is an unbiased estimator
of the population mean.
INPUT:
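pop=rnorm(200,12,3.04)   # population of N=200 units
pop
nrep=10                  # number of repeated samples
smeans=rep(NA,nrep)      # storage for the sample means
for(i in 1:nrep)
{
s=sample(pop,15)         # simple random sample of size 15 (without replacement)
smeans[i]=mean(s)
}
smeans
pmean=mean(pop)          # population mean
pmean
smean=mean(smeans)       # average of the sample means; should be close to pmean
smean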
OUTPUT:
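Ten replicates give only a rough check of unbiasedness, because the average of ten sample means still varies noticeably from run to run. The following is a minimal sketch, assuming a fixed seed (2024) and 5000 replicates; both are illustrative choices that do not come from the exercise. With that many replicates the average of the sample means should settle close to the population mean.

```r
set.seed(2024)                      # assumed seed, only for reproducibility
pop <- rnorm(200, 12, 3.04)         # same population model as in the exercise
nrep <- 5000                        # assumed number of replicates
# draw nrep simple random samples of size 15 and keep each sample mean
smeans <- replicate(nrep, mean(sample(pop, 15)))
pmean <- mean(pop)
bias_hat <- mean(smeans) - pmean    # simulated bias; should be near zero
c(pop_mean = pmean, avg_sample_mean = mean(smeans), bias = bias_hat)
```

The simulated bias shrinks as nrep grows, which is the empirical counterpart of E(sample mean) = population mean.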
2) The yields (in quintals) of the wheat crop for 60 villages in a certain
tehsil are given below. Select a simple random sample of size 30 with
replacement and one of size 20 without replacement. Estimate the
respective average yields per plot along with their standard errors.
20 64 24 19 29 29 26 32 28 35
21 42 32 19 19 19 21 40 18 27
32 28 75 16 37 27 45 63 24 35
41 35 28 28 34 42 61 30 32 25
55 25 29 30 35 39 16 21 23 29
22 25 38 29 31 11 29 35 8 29
INPUT:
pop=c(20,21,32,41,55,22,64,42,28,35,25,25,24,32,75,28,29,38,19,19,16,28,30,29,29,19,37,34,35,31,
29,19,27,42,39,11,26,21,45,61,16,29,32,40,63,30,21,35,28,18,24,32,23,8,35,27,35,25,29,29)
pop
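N=60
n1=30
n2=20
s1=sample(pop,n1,replace=TRUE);s1    # sample with replacement
s2=sample(pop,n2,replace=FALSE);s2   # sample without replacement
m1=mean(s1);m1   # estimated average yield (with replacement)
m2=mean(s2);m2   # estimated average yield (without replacement)
v1=var(s1);v1
v2=var(s2);v2
varm1=v1/n1;varm1                  # estimated variance of the mean, SRSWR: s^2/n
varm2=((N-n2)/(N*n2))*v2;varm2     # estimated variance of the mean, SRSWOR
se1=sqrt(varm1);se1                # standard errors use the actual sample sizes
se2=sqrt(varm2);se2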
OUTPUT:
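The two standard errors differ only in the finite population correction: s^2/n under sampling with replacement, and (1 - n/N) s^2/n without replacement. Both can be sketched as a single helper; the function name `se_mean` and the toy sample below are hypothetical and serve only as an illustration.

```r
# Hypothetical helper: estimated standard error of a sample mean
se_mean <- function(s, N = NULL, replace = TRUE) {
  n <- length(s)
  v <- var(s) / n                   # SRSWR: s^2 / n
  if (!replace) {
    stopifnot(!is.null(N), n <= N)
    v <- (1 - n / N) * v            # SRSWOR: apply the finite population correction
  }
  sqrt(v)
}

toy <- c(2, 4, 6, 8)                # toy sample, not taken from the yield data
se_wr  <- se_mean(toy, replace = TRUE)
se_wor <- se_mean(toy, N = 10, replace = FALSE)
c(se_wr = se_wr, se_wor = se_wor)
```

For the exercise, the calls would be se_mean(s1) and se_mean(s2, N = 60, replace = FALSE).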
3) From population data on a socio-economic survey grouped into four
strata on the basis of certain characteristics, a sample was selected using
stratified sampling such that 10 villages were selected from each stratum
by SRSWOR. The number of households in each of the sample villages is
given below:
Stratum  Total no. of  Total no. of households in sample villages
no.      villages          1    2    3    4    5    6    7    8    9   10
1        1411             43   84   98    0   10   44    0  124   13    0
2        4705             50  147   62   87   84  158  170  104   56  160
3        2558            228  262  110  232  139  178  334    0   63  220
4        14997            17   34   25   34   36    0   25    7   15   31
Estimate the average number of households per village along with an
estimate of its variance.
INPUT:
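data<-read.csv(file.choose())   # CSV with columns stratum and households
data
attach(data)
stratum
households
y1=households[stratum==1];y1
y2=households[stratum==2];y2
y3=households[stratum==3];y3
y4=households[stratum==4];y4
N1=1411;N2=4705;N3=2558;N4=14997   # total villages in each stratum
meany1=mean(y1);meany1
meany2=mean(y2);meany2
meany3=mean(y3);meany3
meany4=mean(y4);meany4
N=N1+N2+N3+N4;N
ybar=(N1*meany1+N2*meany2+N3*meany3+N4*meany4)/N   # stratified mean
ybar
n=10   # sample size in each stratum
vy1=var(y1);vy1
vy2=var(y2);vy2
vy3=var(y3);vy3
vy4=var(y4);vy4
W1=N1/N;W2=N2/N;W3=N3/N;W4=N4/N   # stratum weights
# estimated variance of the stratified mean: sum of W_h^2*((N_h-n)/(N_h*n))*s_h^2
Estimate_var=(((N1-n)/(N1*n))*W1^2*vy1)+(((N2-n)/(N2*n))*W2^2*vy2)+
(((N3-n)/(N3*n))*W3^2*vy3)+(((N4-n)/(N4*n))*W4^2*vy4)
Estimate_var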
OUTPUT:
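For readers without the CSV file, the same estimate can be sketched with the sample values typed in directly from the table in the problem statement. The object names (`hh`, `Nh`, `Wh` and so on) are illustrative, and the variance line assumes the usual estimator for SRSWOR within strata, namely the sum over strata of W_h^2 (1/n_h - 1/N_h) s_h^2.

```r
# Sample households per village, one vector per stratum (from the table)
hh <- list(
  c(43, 84, 98, 0, 10, 44, 0, 124, 13, 0),
  c(50, 147, 62, 87, 84, 158, 170, 104, 56, 160),
  c(228, 262, 110, 232, 139, 178, 334, 0, 63, 220),
  c(17, 34, 25, 34, 36, 0, 25, 7, 15, 31)
)
Nh <- c(1411, 4705, 2558, 14997)    # total villages per stratum
nh <- sapply(hh, length)            # sample size per stratum (10 each)
Wh <- Nh / sum(Nh)                  # stratum weights
ybar_h <- sapply(hh, mean)          # stratum sample means
s2_h <- sapply(hh, var)             # stratum sample variances
ybar_st <- sum(Wh * ybar_h)         # stratified estimate of the mean
var_st <- sum(Wh^2 * (1 / nh - 1 / Nh) * s2_h)   # its estimated variance
c(ybar_st = ybar_st, var_st = var_st, se_st = sqrt(var_st))
```

Working with vectors over strata keeps the code the same length no matter how many strata there are.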
4) The number of cows in milk enumerated (Y) in a random sample of 20
villages from a tehsil having 84 villages, together with the corresponding
census figures (X) for the previous year, is given below:
villages y x
1 237 155
2 1060 583
3 405 205
4 1080 738
5 666 526
6 542 284
7 1337 758
8 1166 681
9 399 143
10 228 111
11 813 616
12 666 576
13 681 540
14 2743 2242
15 1228 940
16 472 387
17 643 675
18 180 220
19 583 654
20 1195 1787
Given that the census figure for the number of cows in milk in the tehsil
was 74488, estimate the number of cows in milk in the current year using
the ratio and regression methods of estimation, and estimate the variance
of each estimator.
INPUT:
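datas=read.csv(file.choose())   # CSV with columns y and x
datas
attach(datas)
ybar=mean(y);ybar
xbar=mean(x);xbar
r=ybar/xbar;r   # estimated ratio
Sx2=var(x);Sx2
Sy2=var(y);Sy2
Sxy=cov(x,y);Sxy
n=20
N=84
X=74488   # census total for the previous year
# ratio method
Ratio_mean=(r*X)/N;Ratio_mean   # ratio estimate of the mean per village
Ratio_total=r*X;Ratio_total     # ratio estimate of the tehsil total
var_ycapratio=((1/n)-(1/N))*(Sy2+(r*r)*Sx2-(2*r*Sxy));var_ycapratio   # variance of the mean estimate
var_ratio_total=N^2*var_ycapratio;var_ratio_total   # variance of the total estimate
# regression method
X_bar=X/N;X_bar
beta=Sxy/Sx2;beta
y_bar_reg=ybar+beta*(X_bar-xbar);y_bar_reg   # regression estimate of the mean per village
Reg_total=N*y_bar_reg;Reg_total              # regression estimate of the tehsil total
rho_cap=cor(y,x)
var_reg=((1/n)-(1/N))*(1-(rho_cap*rho_cap))*Sy2;var_reg   # variance of the mean estimate
var_reg_total=N^2*var_reg;var_reg_total                   # variance of the total estimate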
OUTPUT:
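The question asks for the tehsil total, whereas Ratio_mean and y_bar_reg in the code are estimates of the mean per village. Multiplying a mean estimate by N, and its variance by N^2, gives the corresponding total. The formulas can be written out as below, where s_y^2, s_x^2 and s_xy are the sample variances and covariance (Sy2, Sx2, Sxy in the code) and \hat{\rho} is the sample correlation (rho_cap). These are the usual large-sample approximations; some textbooks use slightly different divisors for the regression variance.

```latex
\begin{align*}
\hat{R} &= \frac{\bar{y}}{\bar{x}}, &
\hat{Y}_R &= \hat{R}\,X, \\
\hat{V}(\hat{Y}_R) &= N^2\left(\frac{1}{n}-\frac{1}{N}\right)
  \left(s_y^2 + \hat{R}^2 s_x^2 - 2\hat{R}\,s_{xy}\right), \\
b &= \frac{s_{xy}}{s_x^2}, &
\hat{Y}_{lr} &= N\left[\bar{y} + b\left(\bar{X}-\bar{x}\right)\right],
  \quad \bar{X} = \frac{X}{N}, \\
\hat{V}(\hat{Y}_{lr}) &= N^2\left(\frac{1}{n}-\frac{1}{N}\right)
  s_y^2\left(1-\hat{\rho}^2\right).
\end{align*}
```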
5) Estimate the population total of y and its standard error from the
following sample, drawn from a population of N = 286 units.
INPUT:
y = c(1, 50, 21, 98, 2, 36, 4, 29, 7, 15, 86, 10, 21, 5, 4)
n = length(y)
n
N = 286
ybar=mean(y)
ybar
varybar=(1 - n/N) * var(y)/n
varybar
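esty=N * ybar   # estimate of the population total
esty
varesty=N^2*varybar   # estimated variance of the estimated total
varesty
seesty=sqrt(varesty)  # standard error of the estimated total
seesty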
OUTPUT:
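The same steps can be wrapped in a small function so that they can be reused for other samples. This is only a sketch: the function name `srs_total` and its return format are hypothetical, and it assumes simple random sampling without replacement, as implied by the (1 - n/N) factor in the computation.

```r
# Hypothetical helper: expansion estimate of a population total under SRSWOR
srs_total <- function(y, N) {
  n <- length(y)
  ybar <- mean(y)
  var_ybar <- (1 - n / N) * var(y) / n   # estimated variance of the sample mean
  c(total = N * ybar, se = N * sqrt(var_ybar))
}

y <- c(1, 50, 21, 98, 2, 36, 4, 29, 7, 15, 86, 10, 21, 5, 4)
res <- srs_total(y, N = 286)
res
```

Returning the estimate and its standard error together makes it harder to report one without the other.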
6) Computations for the ratio estimator are done below, using the R data set
“trees.” The variable of interest is volume and the auxiliary variable is
girth or circumference.
INPUT:
y <- trees$Volume
x <- trees$Girth
s <- c(11,4,29,27)
s
N <- 31
n <- 4
y[s]
x[s]
r <- mean(y[s])/mean(x[s])
r
mux <- mean(x) ;mux
muhatr <- r * mux ;muhatr
ssqr <- (1/(n-1))*sum((y[s] - r*x[s])^2)
ssqr
OUTPUT:
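The computation stops at the residual variance ssqr. As a possible next step, which is an addition and not part of the original exercise, ssqr can be turned into a standard error for the ratio estimate by using the commonly cited approximation (1 - n/N) ssqr / n for the estimated variance. The names `var_muhatr` and `se_muhatr` are illustrative.

```r
y <- trees$Volume                   # variable of interest
x <- trees$Girth                    # auxiliary variable
s <- c(11, 4, 29, 27)               # sampled units, as in the exercise
N <- length(y)                      # 31 trees in the data set
n <- length(s)
r <- mean(y[s]) / mean(x[s])        # sample ratio
muhatr <- r * mean(x)               # ratio estimate of the mean volume
ssqr <- sum((y[s] - r * x[s])^2) / (n - 1)   # residual variance about the ratio line
var_muhatr <- (1 - n / N) * ssqr / n         # approximate estimated variance
se_muhatr <- sqrt(var_muhatr)
c(estimate = muhatr, se = se_muhatr)
```

With only four sampled trees the approximation is rough, so the standard error should be read as indicative.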
7) With N=286, find the sample mean, an estimate of the population total,
the sample variance, an estimate of the variance of the sample mean, and
the standard error of the sample mean of x.
INPUT:
x=c(1,50,21,98,2,36,4,29,7,15,86,10,21,5,4)
N=286
n=length(x);n
mean(x) #sample mean
p=N*mean(x);p #estimate of population total
var(x) #sample variance
#estimate of variance of sample mean
dev=x-mean(x);dev
b=dev^2;b
sum(b)
s2=sum(b)/(n-1);s2 #sample variance with divisor n-1, same as var(x)
d=(N-n)/(N*n);d #finite population correction divided by n
e=d*s2;e
#for standard error
se=sqrt(e);se #standard error of the sample mean
OUTPUT:
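Two details are easy to get wrong when the variance of the sample mean is computed by hand: the sample variance divides by n - 1, and the factor (N - n)/(N n) needs its own parentheses around the denominator, because R evaluates `/` and `*` from left to right. The short check below is a sketch; the object names are illustrative.

```r
x <- c(1, 50, 21, 98, 2, 36, 4, 29, 7, 15, 86, 10, 21, 5, 4)
N <- 286
n <- length(x)
s2_manual <- sum((x - mean(x))^2) / (n - 1)   # divisor n - 1, not n
same_as_var <- isTRUE(all.equal(s2_manual, var(x)))
fpc_over_n <- (N - n) / (N * n)               # intended factor (N - n)/(N n)
no_parens  <- (N - n) / N * n                 # parsed as ((N - n) / N) * n
c(same_as_var = same_as_var, fpc_over_n = fpc_over_n, no_parens = no_parens)
```

The two versions of the factor differ by a multiple of n^2, so the missing parentheses would inflate the variance estimate considerably.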