This document is an examination paper for the course STAT1304: Design and Analysis of Sample Surveys at the University of Hong Kong, dated May 14, 2013. It includes instructions for candidates, questions divided into two parts covering various statistical concepts, and methods related to sample surveys, including estimation techniques and data analysis. The paper assesses knowledge on topics such as sample survey design, statistical data interpretation, and the application of different sampling methods.


THE UNIVERSITY OF HONG KONG

DEPARTMENT OF STATISTICS AND ACTUARIAL SCIENCE

STAT1304 DESIGN AND ANALYSIS OF SAMPLE SURVEYS

May 14, 2013 Time: 2:30 p.m. - 4:30 p.m.

Only approved calculators as announced by the Examinations Secretary can be used in
this examination. It is candidates' responsibility to ensure that their calculator operates
satisfactorily, and candidates must record the name and type of the calculator used on
the front page of the examination script.

Answer ALL questions in Part I and Part II. Please use separate answer books
for Part I and Part II. Marks are shown in square brackets.

Part I [25%]

1. Someone requires certain items of statistics and asks you for advice on whether a
sample survey should be conducted to obtain the statistics. What matters will you
raise for discussion with that person before you give your advice?
[Total: 4 marks]

2. Statistical Authorities conduct sample surveys of business firms on their business
operations. Mention FOUR items of statistical data which business managers will
find useful in making business decisions.
[Total: 4 marks]

3. a) The Government of a territory often conducts two series of surveys in parallel in
respect of the subject of labour: the Labour Force Survey and the Survey of
Employment. The former is conducted on households and the latter on
establishments (both business establishments and other organizations). It is said
that these two series of surveys will together provide a rather complete picture
of the labour market situation of the territory. Explain why this is so.
[5 marks]
b) Based on the Labour Force Survey of a territory, the Labour Force Participation
Rate (LFPR) is 65%. It is also known that the size of the population of the
territory aged 15 or over is 5.6 million.
i) What is the size of the Labour Force? [2 marks]
ii) For females aged 25-34 in the territory, the total number of persons is 1.2
million and the no. of economically inactive persons is 0.48 million. Also,
the no. of unemployed persons for this sex-age group is 28 000. What is the
unemployment rate of this group of people? [2 marks]

[Total: 9 marks]
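The arithmetic behind part (b) can be sketched as follows. This is a minimal illustration, assuming the usual definitions: LFPR = labour force / population aged 15 or over, labour force = population minus economically inactive persons, and unemployment rate = unemployed / labour force. The variable names are illustrative and not part of the paper.

```python
# Part (b)(i): labour force = LFPR x population aged 15 or over.
lfpr = 0.65
population_15_plus = 5_600_000
labour_force = lfpr * population_15_plus

# Part (b)(ii): females aged 25-34.
group_total = 1_200_000
group_inactive = 480_000
group_unemployed = 28_000

# Assumed definition: labour force = total persons - economically inactive.
group_labour_force = group_total - group_inactive
unemployment_rate = group_unemployed / group_labour_force

print(f"Labour force: {labour_force:,.0f}")
print(f"Unemployment rate (females 25-34): {unemployment_rate:.2%}")
```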
S&AS: STAT1304 Design & Analysis of Sample Surveys 2

4. Two series of surveys need to be conducted in order to provide data for the
Statistical Authority to compile the Consumer Price Index (CPI).

a) What are the two series of surveys? What are their respective aims?
b) With Base Year 2005, the CPI values of a certain territory were:

Year Index
2006 103
2007 105
2010 114

A re-basing was done in 2010. The index for the year 2011 under the new base
was 103.

i) What was the CPI for 2011 under the OLD base?

ii) What was the CPI for 2007 under the NEW base?

[Total: 8 marks]
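The re-basing arithmetic in part (b) can be sketched as below. This assumes that the new base year is 2010 (so 2010 = 100 on the new base) and that the two series are linked at that overlap year; the function name `rebase` is hypothetical.

```python
def rebase(value_on_old_base, link_value_on_old_base, link_value_on_new_base=100.0):
    """Convert an old-base index value to the new base via the link year."""
    return value_on_old_base * link_value_on_new_base / link_value_on_old_base

# Assumption: 2010 is the link year, 114 on the 2005 base and 100 on the new base.
old_base_2010 = 114.0

# (i) 2011 on the OLD base: scale the new-base value up by the link ratio.
cpi_2011_old_base = 103.0 * old_base_2010 / 100.0

# (ii) 2007 on the NEW base: scale the old-base value down by the link ratio.
cpi_2007_new_base = rebase(105.0, old_base_2010)

print(f"CPI 2011, old base: {cpi_2011_old_base:.2f}")
print(f"CPI 2007, new base: {cpi_2007_new_base:.2f}")
```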

Part II [75%]

1. A standard quality control check on automobile batteries involves measuring their
average lifetimes. A shipment, weighing 65,000 pounds, from the manufacturer
consists of 1,000 automobile batteries. An investigator plans to sample n = 6
batteries and measure their weights and lifetimes, as she believes there is a high
correlation between the battery weight and the average lifetime. A simple random
sample of battery weights and lifetimes in month A yields the following
measurements.

Let
$x_i$ = weight of battery i
$y_i$ = lifetime of battery i

Table 1: Battery weights and lifetimes in month A

Battery   Weight (pounds)   Lifetime (hours)   $y_i - r x_i$
1          61.5             1180               -21.9942
2          63.5             1250                 8.9165
3          63.5             1245                 3.9165
4          64.0             1245                -5.8558
5          63.8             1248                 1.0531
6          65.8             1300                13.9639
Mean       63.6833          1244.6667            0.00
SD          1.3732            38.1663           12.7199
Sum       382.1             7468

a) Estimate the average battery lifetime $\bar{y}$ using ordinary estimation. Place a
bound on the error of estimation with 95% confidence level. [5 marks]

b) Estimate the average battery lifetime $\bar{y}_r$ using ratio estimation. Place a bound
on the error of estimation with 95% confidence level. [5 marks]

c) Estimate the average battery lifetime $\bar{y}_{lr}$ using regression estimation. Place a
bound on the error of estimation with 95% confidence level. [5 marks]

d) Comment on the precision of the three methods. Show that the regression
estimator is at least as efficient as the ordinary estimator. [5 marks]

[Total: 20 marks]

2. The investigator in Part II Question 1 decides to stratify by months in the sampling
inspection in order to observe month-to-month variation. Stratified random
sampling of battery weights and lifetimes for months A and B yields the following
measurements. The weight of shipment is 65,000 pounds in month A and 68,000
pounds in month B, respectively.

Let
$x_i$ = weight of battery i
$y_i$ = lifetime of battery i

Table 2: Battery weights and lifetimes in months A and B

Month A
Battery   Weight (pounds)   Lifetime (hours)   $y_i - r x_i$
1          61.5             1180               -21.9942
2          63.5             1250                 8.9165
3          63.5             1245                 3.9165
4          64.0             1245                -5.8558
5          63.8             1248                 1.0531
6          65.8             1300                13.9639
Mean       63.6833          1244.67              0.00
SD          1.3732            38.1663           12.7199
Sum       382.1             7468

Month B
Battery   Weight (pounds)   Lifetime (hours)   $y_i - r x_i$
1          62.2             1180               -19.6053
2          63.8             1240                 9.5367
3          63.5             1243                18.3226
4          66.5             1280                -2.5362
5          68.5             1310               -11.1087
6          69.2             1340                 5.3909
Mean       65.6167          1265.50              0.00
SD          2.8771            56.9342           13.9279
Sum       393.7             7593.0

a) Estimate the stratified average battery lifetime $\bar{y}_{st}$ using ordinary estimation.
Place a bound on the error of estimation with 95% confidence level.
[5 marks]

b) Estimate the average battery lifetime $\bar{y}_{st,r}$ using combined ratio estimation.
Place a bound on the error of estimation with 95% confidence level.
[5 marks]

c) Estimate the average battery lifetime $\bar{y}_{st,lr}$ using separate regression
estimation. Place a bound on the error of estimation with 95% confidence level.
[5 marks]

d) Comment on the precision of the three methods. Under what condition is the
combined ratio estimator more efficient than the separate ratio estimator?
[5 marks]

[Total: 20 marks]

3. a) The following table shows the number of births (in thousands) and the birth
rate (in births per thousand of population) in the United States for a systematic
sample of years between 1950 and 1990.

Year   Births   Rate
1950   3632     24.1
1955   4097     25.0
1960   4258     23.7
1965   3760     19.4
1970   3731     18.4
1975   3144     14.6
1980   3612     15.9
1985   3761     15.8
1990   4158     16.7

i) Estimate the total number of births during this 41-year period. Find an
appropriate estimate of the variance. [5 marks]

ii) Estimate the mean birth rate during this period and find an appropriate
estimator of the variance. Referring to the trends of the data, is the mean
birth rate a good predictor of the birth rate for 1995? Explain your answer.
[5 marks]

b) An auditor is confronted with a long list of accounts receivable for a firm. She
must verify the amounts on 10% of these accounts and estimate the average
difference between the audited and book values. Comment on her choice of
using simple random sampling, stratified random sampling, systematic
sampling or cluster sampling for the following situations.

i) The accounts are arranged chronologically;

ii) The accounts are arranged randomly;

iii) The accounts are grouped by department and then listed chronologically
within departments.
[10 marks]

[Total: 20 marks]

4. A newspaper wants to estimate the proportion of voters favoring a certain candidate
in a statewide election. Cluster sampling is used to minimize the cost of conducting
the survey. A simple random sample of 16 precincts is selected from the 497
precincts in the state. The newspaper wants to make the estimation on election day
but before final returns are tallied. Reporters are sent to the polls of each sample
precinct to obtain the pertinent information directly from the voters. The results are
shown in the following table.

Precinct   Number of voters   Number favoring candidate   $a_i - \hat{p} m_i$
           ($m_i$)            ($a_i$)
1          1290                700                          -39.00
2          1170                631                          -39.25
3           840                475                           -6.21
4          1620                935                            6.96
5          1020                621                           36.68
6          1492                820                          -34.71
7          1893               1143                           58.57
8          1942               1187                           74.50
9           971                542                          -14.25
10         1873               1100                           27.02
11         1141                642                          -11.64
12          843                560                           77.07
13         1066                600                          -10.67
14         1171                596                          -74.82
15         1213                782                           87.12
16         1741                860                         -137.36
Sum       21286              12194                           -8.527E-13
Mean       1330.375            762.125                       -5.329E-14
SD          377.6649           225.6820                      59.92

a) Comment on why cluster sampling is more appropriate than simple random
sampling in this situation. [5 marks]

b) Estimate the proportion of voters favoring the candidate and place a bound on
the error of estimation with 95% confidence level. [5 marks]

c) The newspaper wants to conduct a similar survey during the next election. How
large a sample will be needed to estimate the proportion of voters favoring a
similar candidate with a bound of 0.05 on the error of estimation?
[5 marks]

[Total: 15 marks]

************END OF PAPER************

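Simple Random Sampling

Mean $\bar{Y}$
    Point estimate:      $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
    Estimated variance:  $\mathrm{var}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{s^2}{n}$

Total $\tau$
    Point estimate:      $\hat{\tau} = N\bar{y}$
    Estimated variance:  $\mathrm{var}(\hat{\tau}) = N^2\left(1 - \frac{n}{N}\right)\frac{s^2}{n}$

Proportion $P$
    Point estimate:      $\hat{P} = p$
    Estimated variance:  $\mathrm{var}(\hat{P}) = \left(1 - \frac{n}{N}\right)\frac{p(1-p)}{n-1}$

Ratio $R = \bar{Y}/\bar{X}$
    Point estimate:      $r = \bar{y}/\bar{x}$
    Estimated variance:  $\mathrm{var}(r) = \frac{1}{\bar{X}^2}\left(1 - \frac{n}{N}\right)\frac{s_r^2}{n}$
    where                $s_r^2 = \frac{1}{n-1}\sum (y_i - r x_i)^2 = s_y^2 - 2 r \rho s_x s_y + r^2 s_x^2$

Mean $\bar{Y} = R\bar{X}$ (ratio estimation)
    Point estimate:      $\bar{y}_r = r\bar{X}$
    Estimated variance:  $\mathrm{var}(\bar{y}_r) = \left(1 - \frac{n}{N}\right)\frac{s_r^2}{n}$

Total $\tau = N R \bar{X}$ (ratio estimation)
    Point estimate:      $\hat{\tau}_r = N\bar{X} r$
    Estimated variance:  $\mathrm{var}(\hat{\tau}_r) = N^2\left(1 - \frac{n}{N}\right)\frac{s_r^2}{n}$

Mean $\bar{Y}$ (regression estimation)
    Point estimate:      $\bar{y}_{lr} = \bar{y} - \beta(\bar{x} - \bar{X})$
    Estimated variance:  $\mathrm{var}(\bar{y}_{lr}) = \left(1 - \frac{n}{N}\right)\frac{s_y^2 - 2\beta s_{xy} + \beta^2 s_x^2}{n}$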
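As an illustration of how the simple random sampling formulas fit together, the sketch below applies the ordinary, ratio and regression estimators to the month A battery data from Part II Question 1. It is not a model answer. Two assumptions are made that the formula sheet does not state: $\beta$ is taken to be the least-squares slope $s_{xy}/s_x^2$, and the 95% bound is taken to be two estimated standard errors. All variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Month A data: battery weights (pounds) and lifetimes (hours).
x = [61.5, 63.5, 63.5, 64.0, 63.8, 65.8]
y = [1180.0, 1250.0, 1245.0, 1245.0, 1248.0, 1300.0]

N = 1000             # batteries in the shipment
X_bar = 65000 / N    # population mean weight, from the shipment weight
n = len(y)
fpc = 1 - n / N      # finite population correction (1 - n/N)

x_bar, y_bar = mean(x), mean(y)
s_x2, s_y2 = variance(x), variance(y)   # sample variances, divisor n - 1
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Ordinary estimator of the mean.
var_mean = fpc * s_y2 / n

# Ratio estimator of the mean.
r = y_bar / x_bar
s_r2 = sum((yi - r * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
y_ratio = r * X_bar
var_ratio = fpc * s_r2 / n

# Regression estimator; beta is ASSUMED to be the least-squares slope.
beta = s_xy / s_x2
y_reg = y_bar - beta * (x_bar - X_bar)
var_reg = fpc * (s_y2 - 2 * beta * s_xy + beta ** 2 * s_x2) / n


def bound(v):
    """ASSUMED 95% bound on the error of estimation: two standard errors."""
    return 2 * sqrt(v)


for name, est, v in [("ordinary", y_bar, var_mean),
                     ("ratio", y_ratio, var_ratio),
                     ("regression", y_reg, var_reg)]:
    print(f"{name:10s} estimate = {est:9.3f}  bound = {bound(v):7.3f}")
```

With the least-squares slope, the regression variance reduces to $(1 - n/N)(s_y^2 - s_{xy}^2/s_x^2)/n$, which can never exceed the ordinary variance.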

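Stratified Simple Random Sampling

Mean $\bar{Y}$
    Point estimate:      $\bar{y}_{st} = \sum_l W_l \bar{y}_l$
    Estimated variance:  $\mathrm{var}(\bar{y}_{st}) = \sum_l W_l^2 \left(1 - \frac{n_l}{N_l}\right)\frac{s_l^2}{n_l}$

Total $\tau$
    Point estimate:      $\hat{\tau} = \sum_l N_l \bar{y}_l$
    Estimated variance:  $\mathrm{var}(\hat{\tau}) = \sum_l N_l^2 \left(1 - \frac{n_l}{N_l}\right)\frac{s_l^2}{n_l}$

Proportion $P$
    Point estimate:      $\hat{p} = \sum_l W_l p_l$
    Estimated variance:  $\mathrm{var}(\hat{p}) = \sum_l W_l^2 \left(1 - \frac{n_l}{N_l}\right)\frac{p_l(1 - p_l)}{n_l - 1}$

Mean using separate ratio estimate, $R_l = \bar{Y}_l/\bar{X}_l$
    Point estimate:      $\bar{y}_r = \sum_l W_l r_l \bar{X}_l$
    Estimated variance:  $\mathrm{var}(\bar{y}_r) = \sum_l W_l^2 \left(1 - \frac{n_l}{N_l}\right)\frac{s_{rl}^2}{n_l}$
    where                $s_{rl}^2 = \frac{1}{n_l - 1}\sum_i (y_{li} - r_l x_{li})^2 = \frac{1}{n_l - 1}\left(\sum_i y_{li}^2 - 2 r_l \sum_i x_{li} y_{li} + r_l^2 \sum_i x_{li}^2\right)$

Mean using combined ratio estimate, $R = \bar{Y}/\bar{X}$
    Point estimate:      $\bar{y}_r = r\bar{X}$
    Estimated variance:  $\mathrm{var}(\bar{y}_r) = \sum_l W_l^2 \left(1 - \frac{n_l}{N_l}\right)\frac{s_l^2}{n_l}$
    where                $s_l^2 = \frac{1}{n_l - 1}\sum_i \left[(y_{li} - \bar{y}_l) - r(x_{li} - \bar{x}_l)\right]^2$

Mean using separate regression estimate
    Point estimate:      $\bar{y}_{lr} = \sum_j W_j\left[\bar{y}_j - \beta_j(\bar{x}_j - \bar{X}_j)\right]$
    Estimated variance:  $\mathrm{var}(\bar{y}_{lr}) = \sum_j W_j^2\left[s_{yj}^2 - 2\beta_j s_{xyj} + \beta_j^2 s_{xj}^2\right]\frac{1 - f_j}{n_j}$
    Minimum variance:    $\mathrm{Min}\{\mathrm{var}(\bar{y}_{lr,s})\} = \sum_j W_j^2 (1 - \rho_j^2) s_{yj}^2 \frac{1 - f_j}{n_j}$

Mean using combined regression estimate
    Point estimate:      $\bar{y}_{lr} = \bar{y}_{st} - \beta(\bar{x}_{st} - \bar{X})$
    Estimated variance:  $\mathrm{var}(\bar{y}_{lr}) = \sum_j W_j^2\left[s_{yj}^2 - 2\beta s_{xyj} + \beta^2 s_{xj}^2\right]\frac{1 - f_j}{n_j}$
    with                 $\beta = \sum_j \beta_j q_j / \sum_j q_j$, where $q_j = W_j^2 s_{xj}^2 \frac{1 - f_j}{n_j}$
    Minimum variance:    $\mathrm{Min}\{\mathrm{var}(\bar{y}_{lr,c})\} = \mathrm{Min}\{\mathrm{var}(\bar{y}_{lr,s})\} + \sum_j q_j(\beta_j - \beta)^2$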
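The stratified mean and its estimated variance can be sketched as a small helper. The function name `stratified_mean` and its input layout are hypothetical. The example call uses the month A and month B lifetimes from Part II Question 2; the paper does not state how many batteries are in the month B shipment, so the value of 1,000 used below is purely an assumption for illustration.

```python
from math import sqrt
from statistics import mean, variance


def stratified_mean(strata):
    """Return (estimate, estimated variance) of the stratified mean.

    strata: list of (N_l, sample_values) pairs, one pair per stratum.
    """
    N = sum(N_l for N_l, _ in strata)
    estimate = 0.0
    est_var = 0.0
    for N_l, ys in strata:
        W_l = N_l / N                    # stratum weight
        n_l = len(ys)
        estimate += W_l * mean(ys)
        est_var += W_l ** 2 * (1 - n_l / N_l) * variance(ys) / n_l
    return estimate, est_var


month_a = [1180.0, 1250.0, 1245.0, 1245.0, 1248.0, 1300.0]
month_b = [1180.0, 1240.0, 1243.0, 1280.0, 1310.0, 1340.0]

# ASSUMPTION: 1,000 batteries in month B as well (not given in the paper).
y_st, var_st = stratified_mean([(1000, month_a), (1000, month_b)])
print(f"stratified mean = {y_st:.3f}, two-SE bound = {2 * sqrt(var_st):.3f}")
```

With a single stratum the helper reduces to the simple random sampling formulas, which gives a quick consistency check.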

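Systematic Sampling

Mean $\bar{Y}$
    Point estimate:      $\bar{y}_{sy} = \frac{1}{n}\sum_{i=1}^{n} y_i$
    Estimated variance:  $\mathrm{var}(\bar{y}_{sy,srs}) = \left(1 - \frac{n}{N}\right)\frac{s^2}{n}$

Population total $\tau$
    Point estimate:      $\hat{\tau}_{sy} = N\bar{y}_{sy}$
    Estimated variance:  $\mathrm{var}(\hat{\tau}_{sy}) = N^2\left(1 - \frac{n}{N}\right)\frac{s^2}{n}$

Proportion $P$
    Point estimate:      $\hat{p} = p_{sy}$
    Estimated variance:  $\mathrm{var}(\hat{p}) = \left(1 - \frac{n}{N}\right)\frac{p_{sy}(1 - p_{sy})}{n - 1}$

Repeated systematic sample mean $\bar{Y}$
    Point estimate:      $\bar{y}_{sy,rep} = \frac{1}{n_s}\sum_{l=1}^{n_s} \bar{y}_l$
    Estimated variance:  $\mathrm{var}(\bar{y}_{sy,rep}) = \left(1 - \frac{n}{N}\right)\frac{s_{\bar{y}}^2}{n_s}$

Half-sample
    Point estimate:      $\bar{y}_{sy,half} = \frac{1}{2}(\bar{y}_1 + \bar{y}_2)$
    Estimated variance:  $\mathrm{var}(\bar{y}_{sy,half}) = \left(1 - \frac{n}{N}\right)\frac{(\bar{y}_1 - \bar{y}_2)^2}{4}$

Difference
    Point estimate:      $\bar{y}_{sy,diff} = \frac{1}{n}\sum_{i=1}^{n} y_i$
    Estimated variance:  $\mathrm{var}(\bar{y}_{sy,diff}) = \left(1 - \frac{n}{N}\right)\frac{s_{diff}^2}{n}$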
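A minimal sketch of the systematic-sample total, using the births data from Part II Question 3(a) with N = 41 years and n = 9 sampled years. It applies the simple-random-sampling form of the variance listed above. Whether that form is appropriate for data with a trend is exactly what the question asks candidates to judge, so the variance here is only one of the listed options, not a recommended answer.

```python
from math import sqrt
from statistics import mean, variance

# Births (in thousands) for the sampled years 1950, 1955, ..., 1990.
births = [3632.0, 4097.0, 4258.0, 3760.0, 3731.0, 3144.0, 3612.0, 3761.0, 4158.0]

N = 41            # years in the period 1950-1990
n = len(births)   # sampled years

tau_hat = N * mean(births)                              # estimated total births
var_tau = N ** 2 * (1 - n / N) * variance(births) / n   # SRS-form variance

print(f"estimated total = {tau_hat:.1f} thousand births")
print(f"estimated standard error = {sqrt(var_tau):.1f} thousand births")
```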

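Cluster Sampling

Ratio
    Point estimate:      $\bar{y}_{cl,r} = \frac{\sum_{i=1}^{n}\sum_{j=1}^{m_i} y_{ij}}{\sum_{i=1}^{n} m_i}$
    Estimated variance:  $\mathrm{var}(\bar{y}_{cl,r}) = \frac{1}{\bar{M}^2}\left(1 - \frac{n}{N}\right)\frac{s_r^2}{n}$
    where                $s_r^2 = \frac{\sum m_i^2 (\bar{y}_i - \bar{y}_{cl,r})^2}{n - 1}$

Equal sizes cluster sample mean $\bar{Y}$
    Point estimate:      $\bar{y}_{cl,eq} = \frac{\sum_{i=1}^{n} \bar{y}_i}{n}$
    Estimated variance:  $\mathrm{var}(\bar{y}_{cl,eq}) = \frac{1}{m^2}\left(1 - \frac{n}{N}\right)\frac{s_e^2}{n}$
    where                $s_e^2 = \frac{\sum m^2 (\bar{y}_i - \bar{y}_{cl,r})^2}{n - 1}$

Probabilities Proportional to Size
    Point estimate:      $\bar{y}_{pps} = \frac{1}{n}\sum_{i=1}^{n} \bar{y}_i$
    Estimated variance:  $\mathrm{var}(\bar{y}_{pps}) = \frac{1}{n(n-1)}\sum_{i=1}^{n} (\bar{y}_i - \bar{y}_{pps})^2$

Unbiased
    Point estimate:      $\bar{y}_t = \frac{\sum_{i=1}^{n} y_i}{n}$
    Estimated variance:  $\mathrm{var}(\bar{y}_t) = \left(1 - \frac{n}{N}\right)\frac{s_t^2}{n}$
    where                $s_t^2 = \frac{\sum (y_i - \bar{y}_t)^2}{n - 1}$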
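The cluster ratio estimator can be illustrated with the precinct data from Part II Question 4, where the cluster totals are the numbers favoring the candidate ($a_i$) and the cluster sizes are the numbers of voters ($m_i$). This is a sketch under two assumptions: the unknown population mean cluster size $\bar{M}$ is replaced by the sample mean $\bar{m}$, and the 95% bound is taken as two estimated standard errors. Note that $\sum m_i^2(\bar{y}_i - \bar{y}_{cl,r})^2$ equals $\sum (a_i - \hat{p} m_i)^2$, which is the residual column in the table.

```python
from math import sqrt

# Precinct data: voters (m_i) and voters favoring the candidate (a_i).
m = [1290, 1170, 840, 1620, 1020, 1492, 1893, 1942,
     971, 1873, 1141, 843, 1066, 1171, 1213, 1741]
a = [700, 631, 475, 935, 621, 820, 1143, 1187,
     542, 1100, 642, 560, 600, 596, 782, 860]

N = 497          # precincts in the state
n = len(m)       # sampled precincts

p_hat = sum(a) / sum(m)                     # ratio estimate of the proportion
s_r2 = sum((ai - p_hat * mi) ** 2 for ai, mi in zip(a, m)) / (n - 1)

m_bar = sum(m) / n                          # ASSUMED stand-in for M-bar
var_p = (1 / m_bar ** 2) * (1 - n / N) * s_r2 / n
bound = 2 * sqrt(var_p)                     # ASSUMED two-standard-error bound

print(f"p_hat = {p_hat:.4f}, bound = {bound:.4f}")
```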

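Sample size determination

SRS, systematic and cluster sampling
    $n \ge \frac{N S^2}{N d^2 + S^2}$

Stratified SRS
    $n \ge \frac{\sum_l N_l^2 S_l^2 / w_l}{N^2 d^2 + \sum_l N_l S_l^2}$

Repeated systematic sampling, given $n_s$
    $n \ge \frac{N S_{\bar{y}}^2 / n_s - N d^2}{S_{\bar{y}}^2 / n_s}$

Allocations

Optimum
    Stratum size:  $n_l = n \left(\frac{N_l S_l / \sqrt{c_l}}{\sum_i N_i S_i / \sqrt{c_i}}\right)$
    Sample size:   $n \ge \frac{\left(\sum_l N_l S_l \sqrt{c_l}\right)\left(\sum_l N_l S_l / \sqrt{c_l}\right)}{N^2 d^2 + \sum_l N_l S_l^2}$

Neyman
    Stratum size:  $n_l = n \left(\frac{N_l S_l}{\sum_i N_i S_i}\right)$
    Sample size:   $n \ge \frac{\left(\sum_l N_l S_l\right)^2}{N^2 d^2 + \sum_l N_l S_l^2}$

Proportional
    Stratum size:  $n_l = n \left(\frac{N_l}{N}\right)$
    Sample size:   $n \ge \frac{N \sum_l N_l S_l^2}{N^2 d^2 + \sum_l N_l S_l^2}$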
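The first sample-size formula can be sketched as a small helper. The function name `sample_size` is hypothetical, and the formula sheet does not define $d$; the sketch assumes $d = B/2$, where $B$ is the desired bound on the error of estimation. The example plugs in figures from the Part II Question 4 table, taking $S^2 = (s_r/\bar{m})^2$ as a planning value, which is also an assumption.

```python
from math import ceil


def sample_size(N, S2, d):
    """Smallest integer n satisfying n >= N*S2 / (N*d**2 + S2)."""
    return ceil(N * S2 / (N * d ** 2 + S2))


N = 497                    # precincts in the state
s_r = 59.92                # SD of the residuals a_i - p_hat * m_i (from the table)
m_bar = 1330.375           # mean number of voters per sampled precinct
S2 = (s_r / m_bar) ** 2    # ASSUMED planning value for S^2

B = 0.05                   # required bound on the error of estimation
d = B / 2                  # ASSUMPTION: d is half the bound

n_needed = sample_size(N, S2, d)
print(f"precincts needed: {n_needed}")
```

Rounding up with `ceil` keeps the achieved bound at or below the target; rounding to the nearest integer could fall short.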
