02 Simple Random Sampling

Simple random sampling is a technique where each unit of a population has an equal chance of being selected, with two methods: sampling with replacement and without replacement. The document outlines properties of simple random sampling, methods for drawing samples such as the lottery method and random number tables, and techniques for estimating population parameters. It concludes with a theorem stating that the sample mean is an unbiased estimator of the population mean.
 Simple random sampling:


Simple random sampling is the technique of drawing a sample in such a way that each unit of the
population has an equal and independent chance of being included in the sample.

 Simple random sampling without replacement:


If each selected unit is noted and not returned to the population, and this procedure is repeated
until n distinct units are selected, the result is a simple random sample of n units. This
procedure is called simple random sampling without replacement.

 Simple random sampling with replacement:


If each selected unit is noted and then returned to the population before the next drawing is
made, and this procedure is repeated n times, the result is a simple random sample of n
units. This procedure is called simple random sampling with replacement. Occasionally, sampling
with replacement is referred to as unrestricted sampling.
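The two schemes can be sketched in Python as follows. This is only an illustration: the population labels, the seed and the helper names `srs_without_replacement` and `srs_with_replacement` are assumptions made for the example, and the standard library's pseudo-random generator stands in for a physical drawing.

```python
import random

def srs_without_replacement(population, n, rng):
    """Draw n distinct units; a drawn unit is not returned to the population."""
    return rng.sample(population, n)

def srs_with_replacement(population, n, rng):
    """Draw n units; each unit is returned before the next draw, so repeats can occur."""
    return [rng.choice(population) for _ in range(n)]

rng = random.Random(2024)            # fixed seed, used only for reproducibility
units = list(range(1, 11))           # hypothetical population labelled 1..10
wor = srs_without_replacement(units, 4, rng)
wr = srs_with_replacement(units, 15, rng)   # here n may exceed N
print(wor, wr)
```

Note that the with-replacement draw is allowed to be longer than the population, whereas `rng.sample` raises an error if more than N units are requested.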

 Properties of simple random sampling:


1) In sampling with replacement, the probability of selecting a specified element $u_i$
on each of the n draws from a population of N elements is:
$p(u_i) = \frac{1}{N}; \quad i = 1, 2, \ldots, N.$
2) For sampling without replacement, the possible number of different combinations
of n elements formed from N elements is: $\binom{N}{n}$. For sampling with replacement,
the possible number of different ordered samples of n elements formed from N
elements is: $N^n$.
3) In sampling without replacement, the probability of selecting a specified element
$u_i$ on any draw is equal to the probability of selecting it on the first draw.
To verify this, we note that the probability that the specified unit is
selected on the first draw is: $p_1 = \frac{1}{N}$. The second draw is conditional upon the first
draw, since the sample is being drawn without replacement.
Thus, the probability that the specified unit is selected on the
second draw is the product of the probability of the event $A$, that the unit is not
selected at the first draw, and the conditional probability of the event $B$, that it is
selected at the second draw. That is, $p_2 = \left(\frac{N-1}{N}\right)\left(\frac{1}{N-1}\right) = \frac{1}{N}$. Similarly, the third
draw is conditional upon the two previous draws.
Thus, the probability that the specified unit is selected on the third draw is
the product of the probability that the unit is not selected
at the first two draws and the conditional probability of the event $C$, that it is
selected at the third draw. That is, $p_3 = \left(\frac{N-1}{N}\right)\left(\frac{N-2}{N-1}\right)\left(\frac{1}{N-2}\right) = \frac{1}{N}$.
Similarly, for the $r$th draw, the probability is:

$$p_r = \left(\frac{N-1}{N}\right)\left(\frac{N-2}{N-1}\right) \cdots \left(\frac{N-r+1}{N-r+2}\right)\left(\frac{1}{N-r+1}\right) = \frac{1}{N}$$

This shows that the probability of selecting a specified element $u_i$ on any
draw is equal to the probability of selecting it on the first draw irrespective of
whether the elements are drawn with replacement or without replacement.
4) In simple random sampling of n elements drawn without replacement from a population of N
elements, the probability that any specified element is included in the sample is $\frac{n}{N}$.
To verify this, we note that the element can be selected on at most one draw, so the
probability that it is included is the sum of the probabilities that it is selected at the
first draw, the second draw and so on. That is, $\frac{1}{N} + \frac{1}{N} + \cdots + \frac{1}{N} = \frac{n}{N}$.
5) In simple random sampling, each possible combination of n different elements
drawn from a population of N elements has the same probability of being selected
for the sample and is equal to: $\frac{1}{\binom{N}{n}}$.
To verify this, we note that at the first draw, the probability that one of the
n specified units will be selected is $\frac{n}{N}$. At the second draw, the probability that
one of the remaining $(n-1)$ specified units will be selected is $\frac{n-1}{N-1}$. At the $n$th
draw, the probability is $\frac{n-(n-1)}{N-(n-1)} = \frac{1}{N-(n-1)}$.
So, there are $\binom{N}{n}$ combinations of elements, any one of which may
constitute the sample. Then, each combination has the probability of selection
equal to

$$\frac{n}{N} \cdot \frac{n-1}{N-1} \cdots \frac{1}{N-(n-1)} = \frac{n(n-1) \cdots 1 \cdot (N-n)!}{N(N-1) \cdots (N-(n-1)) \cdot (N-n)!} = \frac{n!\,(N-n)!}{N!} = \frac{1}{\binom{N}{n}}.$$

In arriving at the above probability, we have assumed that sampling was done
without replacement.
6) When sampling is done without replacement, the sample size n cannot exceed the
population size N , but when sampling is done with replacement, the sample size
n can be of any size.
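
Properties 3 to 5 can be checked by brute force on a small case. The sketch below assumes a hypothetical population of N = 5 labelled units and samples of size n = 3, enumerates every ordered sample that can arise without replacement (each taken as equally likely, which follows from the draw-by-draw probabilities above), and uses exact fractions. The names `target` and `combo` are chosen only for the illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

N, n = 5, 3                                  # hypothetical sizes
units = range(1, N + 1)
ordered = list(permutations(units, n))       # all ordered samples without replacement
p_each = Fraction(1, len(ordered))           # each ordered sample is equally likely

target = 2                                   # the "specified element"
# Property 3: probability that the target is selected on draw r, r = 1..n
p_draw = [sum(p_each for s in ordered if s[r] == target) for r in range(n)]
# Property 4: probability that the target is included in the sample
p_included = sum(p_each for s in ordered if target in s)
# Property 5: probability of one particular combination of n units
combo = {1, 3, 4}
p_combo = sum(p_each for s in ordered if set(s) == combo)

print(p_draw, p_included, p_combo, Fraction(1, comb(N, n)))
```

Changing `N`, `n`, `target` or `combo` gives the same pattern: 1/N on every draw, n/N for inclusion, and one over the number of combinations for any fixed sample.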

 Drawing a simple random sample:


Drawing a simple random sample from a population requires that in every draw, each eligible
population element be assigned equal probability of selection. To ensure randomness in the
selection, the method of selection must be independent of human judgment as far as possible.
There are two basic procedures. They are
1) The lottery method.
2) The use of a random number table.


 Lottery method:
The simplest method of selecting a simple random sample is the so-called lottery method, in
which each member of the population is identified by some means, such as a marble, a disk, a
piece of paper and the like. The identifiers are then placed in an urn or box and well mixed.
A sample of the required size is then selected. The lottery method is illustrated below by means of
an example.

Suppose we want to select n candidates out of N . We assign the numbers 1 to N , one number to
each candidate and write these numbers on N slips, which are made as homogeneous as possible
in shape, size, color, etc. These slips are then put in a bag and thoroughly shuffled and then n
slips are drawn one by one. The n candidates corresponding to numbers on the slips drawn will
constitute a random sample.
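
The slip-drawing procedure just described can be mimicked in a few lines. This is a minimal sketch in which a pseudo-random shuffle plays the role of mixing the bag; the function name `lottery_draw` and the sizes used are assumptions for the example.

```python
import random

def lottery_draw(N, n, rng):
    """Number N slips, shuffle them, and draw n slips one by one."""
    slips = list(range(1, N + 1))            # one numbered slip per candidate
    rng.shuffle(slips)                       # the "thorough shuffling" of the bag
    return [slips.pop() for _ in range(n)]   # a drawn slip is not put back

sample = lottery_draw(N=20, n=5, rng=random.Random(1))   # hypothetical sizes and seed
print(sample)
```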

This method of selection is quite independent of the properties of the population. Generally, cards
are used in place of slips. We make one card correspond to one of the units of the population by
writing on it the number of the unit. The pack of cards is a kind of miniature of the population for
sampling purposes. The cards are shuffled a number of times and then a card is drawn at random
from them. This is one of the most reliable methods of selecting a random sample.

Theoretically, the lottery method is free from human bias and thus ensures randomness.
However, the randomness of the lottery method depends on the assumption that the identifiers
(marble, disk or piece of paper) are thoroughly mixed so that the population can be regarded as
being arranged randomly. In practice, such satisfactory mixing is difficult to ensure, and so the
use of random numbers is generally the preferred option for selecting a sample.

 The use of a random number table:


The most practical and inexpensive method of selecting a random sample is to use a
random number table, which is constructed so that each of the digits 0, 1, 2, ..., 9 appears
with approximately the same frequency and independently of the others.

In reality, a simple random sample is drawn unit by unit. If a list of the population units, that is, a
sampling frame is available, then the selection of a random sample may be easily accomplished
with the use of random numbers.

The units in the population are numbered from 1 to N. A series of random numbers between 1 and
N is then drawn from the random number table, one after another. Once the first random
number is drawn, we may decide to proceed in any direction, vertically, horizontally, diagonally
or in any other systematic way, to obtain the remaining units in the sample.

At any draw, the process used must give an equal chance of selection to every number between 1 and
N in the population. The units that bear these n numbers constitute our desired sample and we
technically call these n numbers a sample of size n.

It is important to keep in mind that whatever procedure is used, we must ensure that the numbers
so selected are all different and none are greater than the population size N .


The use of random number table involves a number of rejections since all numbers greater than
N appearing in the table are not considered for selection. The use of random numbers is,
therefore, modified and some of these modified procedures are:
1) Remainder method.
2) Quotient method.

 Remainder method:
Suppose that a simple random sample of fixed size is to be drawn from a population comprising
N units. Let N be an r-digit number and let N' be the highest r-digit multiple of N. A
random number k is chosen from 1 to N' and the unit with the serial number equal to the
remainder obtained on dividing k by N is selected. The second and the subsequent units are
selected in a similar manner. If the remainder is zero, the last unit (unit N) is selected.

For example, suppose that a random sample of size 5 is to be selected from a population of size
150 units. 150 is a 3-digit number and the highest 3-digit multiple of 150 is: $150 \times 6 = 900$. A
random number 277 is chosen from 001 to 900. Divide 277 by 150. The remainder is 127. The
unit labeled 127 in the population is selected.

To select the second unit, choose the next random number. This number is 130, which is less
than 150. We directly choose this number as our second unit in the sample.

The next random number is 802, which results in a remainder 52. The unit corresponding to this
number is our third selected unit.

Continuing this process, we arrive at the next two numbers. These are 108 and 91. So, the
units thus selected are those labeled 52, 91, 108, 127 and 130. Had there been any random number
larger than 900, we would have ignored it.

Note that the selection did not lead to any rejection of the random numbers. That is, all of the
first 5 random numbers could be used to select units for the sample.
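
The remainder method can be sketched as below, using the numbers from the example (N = 150 and the random numbers 277, 130 and 802). The helper names `highest_multiple` and `remainder_method` are hypothetical, and the random numbers are simply typed in rather than read from a table.

```python
def highest_multiple(N):
    """Largest multiple of N that has the same number of digits as N."""
    limit = 10 ** len(str(N)) - 1        # largest r-digit number, e.g. 999 for r = 3
    return (limit // N) * N

def remainder_method(k, N):
    """Serial number selected by the random number k (assumed to lie in 1..N')."""
    r = k % N
    return N if r == 0 else r            # a zero remainder selects the last unit

N = 150
N_prime = highest_multiple(N)
draws = [277, 130, 802]                  # random numbers quoted in the example
selected = [remainder_method(k, N) for k in draws if 1 <= k <= N_prime]
print(N_prime, selected)
```

Random numbers above N' are skipped by the filter in the list comprehension, which mirrors the rule that numbers larger than 900 are ignored.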

 Quotient method:
Suppose that a simple random sample of fixed size is to be drawn from a population comprising
N units. Let N be an r-digit number and let N' be the highest r-digit multiple of N, such that
$\frac{N'}{N} = q$. A random number k is chosen from 0 to $(N' - 1)$ and the unit with the serial number
equal to (quotient + 1), where the quotient is obtained on dividing k by q, is selected. The second
and the subsequent units are selected in a similar manner. Starting the range at 0 gives every unit
exactly q random numbers that select it.

For example, suppose that a random sample of size 2 is to be selected from a population of size
16 units. 16 is a 2-digit number and the highest 2-digit multiple of 16 is: $16 \times 6 = 96$, so that
$q = \frac{96}{16} = 6$. A random number 65 is chosen from 00 to 95. Divide 65 by 6. The quotient is 10. The
unit labeled 10 + 1 = 11 in the population is selected. The second unit is selected in a similar manner.
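
A short sketch of the quotient method follows, under the assumption that the random number k is taken from 0 to N' − 1 so that every unit is selected by exactly q values of k. For the example, dividing 65 by q = 6 gives quotient 10, so the serial number is 10 + 1 = 11. The function name `quotient_method` is hypothetical.

```python
from collections import Counter

def quotient_method(k, N, N_prime):
    """Serial number selected by k (assumed in 0..N'-1): quotient of k by q, plus 1."""
    q = N_prime // N
    return k // q + 1

N, N_prime = 16, 96
unit = quotient_method(65, N, N_prime)   # 65 // 6 = 10, so the serial number is 11

# How many values of k lead to each unit when k runs over 0..N'-1
chances = Counter(quotient_method(k, N, N_prime) for k in range(N_prime))
print(unit, chances)
```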

 Estimation of population mean and total:


A frequent objective of a sample survey is to estimate the population mean and population
total, in order to draw inferences about a population from the information contained in a sample. Suppose the
values obtained for any specified item in the population of size N are denoted by: $y_1, y_2, \ldots, y_N$.
The corresponding values for the units in the sample of size n are denoted by: $y_1, y_2, \ldots, y_n$. The
formulas used for the totals and means of the population and sample are summarized below:

Population total: $Y = \sum_{i=1}^{N} y_i$. Sample total: $y = \sum_{i=1}^{n} y_i$.

Population mean: $\bar{Y} = \frac{\sum_{i=1}^{N} y_i}{N} = \frac{Y}{N}$. Sample mean: $\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{y}{n}$.

$$\therefore\; Y = N\bar{Y}, \qquad y = n\bar{y}, \qquad \hat{Y} = N\bar{y}$$

$$\sigma^2 = \frac{\sum_{i=1}^{N} (y_i - \bar{Y})^2}{N}$$

$$S^2 = \frac{\sum_{i=1}^{N} (y_i - \bar{Y})^2}{N-1}, \qquad s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}$$

$$\therefore\; (N-1)\,S^2 = N\sigma^2$$
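
As a small numerical illustration of these definitions, the sketch below computes the population total, mean, $\sigma^2$ and $S^2$ for a hypothetical population of five integer values and confirms the relation $(N-1)S^2 = N\sigma^2$. The data and the function name `population_summary` are assumptions made for the example.

```python
from fractions import Fraction

def population_summary(values):
    """Total, mean, sigma^2 (divisor N) and S^2 (divisor N - 1) of integer values."""
    N = len(values)
    total = sum(values)
    mean = Fraction(total, N)
    ss = sum((v - mean) ** 2 for v in values)    # sum of squared deviations
    return {"N": N, "total": total, "mean": mean,
            "sigma2": ss / N, "S2": ss / (N - 1)}

pop = population_summary([2, 4, 6, 8, 10])       # hypothetical population values
print(pop)
```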

 Theorem:
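The sample mean $\bar{y}$ for a simple random sample of size n is an unbiased estimator of the
population mean $\bar{Y}$. Symbolically, $E(\bar{y}) = \bar{Y}$.
Proof:
Since each population value is equally likely to be obtained on any draw, $p(y_j) = \frac{1}{N}$ and $E(y_i) = \sum_{j=1}^{N} y_j\, p(y_j) = \frac{1}{N}\sum_{j=1}^{N} y_j = \bar{Y}$. Hence,

$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} \;\Rightarrow\; E(\bar{y}) = \frac{\sum_{i=1}^{n} E(y_i)}{n} = \frac{\sum_{i=1}^{n} \left[\sum_{j=1}^{N} y_j\, p(y_j)\right]}{n} = \frac{\sum_{i=1}^{n} \left[\frac{1}{N}\sum_{j=1}^{N} y_j\right]}{n} = \frac{\sum_{i=1}^{n} \bar{Y}}{n} = \frac{n\bar{Y}}{n} = \bar{Y}$$

$$\hat{Y} = N\bar{y} \;\Rightarrow\; E(\hat{Y}) = N E(\bar{y}) = N\bar{Y} = Y$$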
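
The theorem can be checked by complete enumeration on a small case. The sketch below assumes a hypothetical population of five values and samples of size 2 drawn without replacement; every one of the possible samples is listed, and the average of the sample means is compared with the population mean using exact fractions.

```python
from fractions import Fraction
from itertools import combinations

population = [2, 4, 6, 8, 10]                 # hypothetical population values
N, n = len(population), 2
Y_bar = Fraction(sum(population), N)          # population mean
Y = sum(population)                           # population total

samples = list(combinations(population, n))   # all equally likely samples
means = [Fraction(sum(s), n) for s in samples]
E_ybar = sum(means) / len(means)              # expectation of the sample mean
E_Yhat = N * E_ybar                           # expectation of the estimated total
print(E_ybar, Y_bar, E_Yhat, Y)
```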

 Theorem:
In simple random sampling of n units without replacement from a population of N units, the
variance of the sample mean is given by: $\operatorname{var}(\bar{y}) = \left(\frac{N-n}{N}\right)\frac{S^2}{n}$.
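Proof:
Let $y = \sum_{i=1}^{n} y_i$ denote the sample total, so that $E(y) = n\bar{Y}$. Then

$$\operatorname{var}(y) = E\left[y - E(y)\right]^2 = E\left[y - n\bar{Y}\right]^2 = E\left[\sum_{i=1}^{n} y_i - n\bar{Y}\right]^2 = E\left[\sum_{i=1}^{n} (y_i - \bar{Y})\right]^2 = \sum_{i=1}^{n} E(y_i - \bar{Y})^2 + \sum_{i \ne j}^{n} E(y_i - \bar{Y})(y_j - \bar{Y})$$

$$\Rightarrow\; \operatorname{var}(y) = n\sigma^2 + \sum_{i \ne j}^{n} E(y_i - \bar{Y})(y_j - \bar{Y})$$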

Now, consider the second term on the right-hand side of the above equation. When the sample
is drawn without replacement, the probability of obtaining $y_k$ on the $i$th draw is $\frac{1}{N}$ and the
probability of obtaining $y_m$ on the $j$th draw, knowing that $y_k$ has been drawn, is $\frac{1}{N-1}$. Hence, the
probability of obtaining $y_k$ on the $i$th draw and $y_m$ on the $j$th draw is $\frac{1}{N(N-1)}$. Hence, the
second term reduces as follows:

$$E(y_i - \bar{Y})(y_j - \bar{Y}) = E(y_i y_j) - \bar{Y}E(y_i) - \bar{Y}E(y_j) + \bar{Y}^2 = E(y_i y_j) - \bar{Y}\bar{Y} - \bar{Y}\bar{Y} + \bar{Y}^2 = E(y_i y_j) - \bar{Y}^2$$

$$E(y_i y_j) - \bar{Y}^2 = \sum_{k \ne m}^{N} y_k y_m\, p(y_k y_m) - \bar{Y}^2 = \sum_{k \ne m}^{N} y_k y_m \frac{1}{N(N-1)} - \bar{Y}^2$$

$$= \frac{1}{N(N-1)}\left[\left(\sum_{k=1}^{N} y_k\right)^2 - \sum_{k=1}^{N} y_k^2\right] - \bar{Y}^2 = \frac{1}{N(N-1)}\left[N^2\bar{Y}^2 - \sum_{k=1}^{N} y_k^2 - N(N-1)\bar{Y}^2\right]$$

$$= -\frac{1}{N(N-1)}\left[\sum_{k=1}^{N} y_k^2 - N\bar{Y}^2\right] = -\frac{1}{N(N-1)}\sum_{k=1}^{N} (y_k - \bar{Y})^2 = -\frac{1}{N(N-1)}\, N\sigma^2 = -\frac{\sigma^2}{N-1}$$

$$\Rightarrow\; \sum_{i \ne j}^{n} E(y_i - \bar{Y})(y_j - \bar{Y}) = \sum_{i \ne j}^{n} \left(-\frac{\sigma^2}{N-1}\right) = -\frac{n(n-1)\,\sigma^2}{N-1}$$

$$\Rightarrow\; \operatorname{var}(y) = n\sigma^2 + \sum_{i \ne j}^{n} E(y_i - \bar{Y})(y_j - \bar{Y}) = n\sigma^2 - \frac{n(n-1)\,\sigma^2}{N-1} = n\sigma^2\left[1 - \frac{n-1}{N-1}\right] = n\sigma^2\left[\frac{N-1-n+1}{N-1}\right] = n\sigma^2\left[\frac{N-n}{N-1}\right]$$

$$\Rightarrow\; \operatorname{var}(y) = \left(\frac{N-n}{N-1}\right) n\sigma^2 = \left(\frac{N-n}{N-1}\right) n \left(\frac{N-1}{N}\right) S^2 = \left(\frac{N-n}{N}\right) n S^2$$

$$\Rightarrow\; \operatorname{var}(\bar{y}) = \operatorname{var}\left(\frac{y}{n}\right) = \frac{1}{n^2}\operatorname{var}(y) = \frac{1}{n^2}\left(\frac{N-n}{N-1}\right) n\sigma^2 = \left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}$$

$$\Rightarrow\; \operatorname{var}(\bar{y}) = \frac{1}{n^2}\left(\frac{N-n}{N}\right) n S^2 = \left(\frac{N-n}{N}\right)\frac{S^2}{n}$$

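When the sampling is done with replacement, the draws are independent, so we are left with only the first term, since
$E(y_i - \bar{Y})(y_j - \bar{Y}) = 0$ for $i \ne j$, and consequently,

$$\operatorname{var}(y) = n\sigma^2 \;\Rightarrow\; \operatorname{var}(\bar{y}) = \frac{\sigma^2}{n}$$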
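
Both variance formulas can be verified by enumerating every possible sample from a small population. The sketch below assumes the hypothetical values 2, 4, 6, 8, 10 and n = 2; unordered combinations represent sampling without replacement and ordered tuples with repeats represent sampling with replacement. All names are chosen for the illustration only.

```python
from fractions import Fraction
from itertools import combinations, product

population = [2, 4, 6, 8, 10]                 # hypothetical population values
N, n = len(population), 2
Y_bar = Fraction(sum(population), N)
ss = sum((v - Y_bar) ** 2 for v in population)
sigma2, S2 = ss / N, ss / (N - 1)

def variance_of_sample_mean(samples):
    """Exact variance of the sample mean over equally likely samples."""
    means = [Fraction(sum(s), n) for s in samples]
    centre = sum(means) / len(means)
    return sum((m - centre) ** 2 for m in means) / len(means)

var_wor = variance_of_sample_mean(combinations(population, n))     # without replacement
var_wr = variance_of_sample_mean(product(population, repeat=n))    # with replacement
print(var_wor, var_wr)
```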


The variance of $\bar{y}$ under sampling without replacement differs from that under sampling with
replacement by the factor $\frac{N-n}{N-1}$. In other words, $\operatorname{var}(\bar{y})$ in sampling without replacement is only
$\frac{N-n}{N-1}$ times its value in sampling with replacement.

When N is large, $\frac{N-n}{N-1} \approx 1 - \frac{n}{N} = 1 - f$, where $f = \frac{n}{N}$. The factor $\frac{N-n}{N-1}$ is less than 1 for
any n such that $1 < n \le N$. Therefore, the variance of $\bar{y}$ without replacement is less than the
variance of $\bar{y}$ with replacement. That is, $\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n} < \frac{\sigma^2}{n}$.

In other words, for the same sample size, the simple estimator tends to vary less around the
population characteristics under sampling without replacement than that under with replacement.
In this sense, sampling without replacement should be preferred over sampling with replacement.

The factor $1 - f$ is a correction factor for the finite size of the population and is called the finite
population correction (fpc). The sampling fraction $f = \frac{n}{N}$ is small when either the sample is small
or the population is large relative to the sample. In either case, the factor $1 - f$ approaches 1 and can be ignored. In such
cases, the variance of $\bar{y}$ does not depend on N and there is little or no practical difference
between the two methods.
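
To give a feel for the size of the correction, the sketch below evaluates $1 - f$ and $\frac{N-n}{N-1}$ for two hypothetical combinations of N and n. The sizes and the helper name `fpc` are illustrative assumptions, not values from the text.

```python
def fpc(N, n):
    """Return (1 - f, (N - n)/(N - 1)) for population size N and sample size n."""
    f = n / N                            # sampling fraction
    return 1 - f, (N - n) / (N - 1)

# Hypothetical sizes chosen only to contrast the two regimes.
large_fraction = fpc(N=100, n=50)        # half the population is sampled
small_fraction = fpc(N=10_000, n=50)     # a very small sampling fraction
print(large_fraction, small_fraction)
```

With a small sampling fraction both factors are close to 1, which is the situation in which the correction can be ignored.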

 Theorem:
In simple random sampling of n units without replacement from a population of N units, the
sample variance $s^2$ is an unbiased estimator of the population variance $S^2$. That is, $E(s^2) = S^2$.
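Proof:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 \;\Rightarrow\; E(s^2) = \frac{1}{n-1} E\left[\sum_{i=1}^{n} (y_i - \bar{y})^2\right] = \frac{1}{n-1} E\left[\sum_{i=1}^{n} \left\{(y_i - \bar{Y}) - (\bar{y} - \bar{Y})\right\}^2\right]$$

$$\Rightarrow\; E(s^2) = \frac{1}{n-1}\left[\sum_{i=1}^{n} E(y_i - \bar{Y})^2 + n E(\bar{y} - \bar{Y})^2 - 2E\left\{(\bar{y} - \bar{Y})\sum_{i=1}^{n} (y_i - \bar{Y})\right\}\right]$$

$$= \frac{1}{n-1}\left[\sum_{i=1}^{n} E(y_i - \bar{Y})^2 + n E(\bar{y} - \bar{Y})^2 - 2E\left\{(\bar{y} - \bar{Y})(n\bar{y} - n\bar{Y})\right\}\right]$$

$$= \frac{1}{n-1}\left[\sum_{i=1}^{n} E(y_i - \bar{Y})^2 + n E(\bar{y} - \bar{Y})^2 - 2n E(\bar{y} - \bar{Y})^2\right] = \frac{1}{n-1}\left[\sum_{i=1}^{n} E(y_i - \bar{Y})^2 - n E(\bar{y} - \bar{Y})^2\right]$$

$$\Rightarrow\; E(s^2) = \frac{1}{n-1}\left[n\sigma^2 - n\operatorname{var}(\bar{y})\right]$$

Now, for sampling without replacement, $\operatorname{var}(\bar{y}) = \left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}$. So, we have from the above that

$$E(s^2) = \frac{1}{n-1}\left[n\sigma^2 - n\left(\frac{N-n}{N-1}\right)\frac{\sigma^2}{n}\right] = \frac{\sigma^2}{n-1}\left[n - \frac{N-n}{N-1}\right] = \frac{\sigma^2}{n-1}\left[\frac{nN - n - N + n}{N-1}\right]$$

$$= \frac{\sigma^2}{n-1}\left[\frac{N(n-1)}{N-1}\right] = \frac{N\sigma^2}{N-1} = S^2, \quad \text{since } N\sigma^2 = (N-1)S^2.$$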

Again, for sampling with replacement, $\operatorname{var}(\bar{y}) = \frac{\sigma^2}{n}$. So, we have from the above that

$$E(s^2) = \frac{1}{n-1}\left[n\sigma^2 - n\,\frac{\sigma^2}{n}\right] = \frac{1}{n-1}\left[n\sigma^2 - \sigma^2\right] = \frac{\sigma^2 (n-1)}{n-1} = \sigma^2$$
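
These two expectations can also be confirmed by enumeration. The sketch below reuses the hypothetical population 2, 4, 6, 8, 10 with n = 3 and averages $s^2$ over all samples without replacement and over all ordered samples with replacement. The helper names `s2` and `expected_s2` are assumptions for the example.

```python
from fractions import Fraction
from itertools import combinations, product

population = [2, 4, 6, 8, 10]                 # hypothetical population values
N = len(population)
Y_bar = Fraction(sum(population), N)
ss = sum((v - Y_bar) ** 2 for v in population)
sigma2, S2 = ss / N, ss / (N - 1)

def s2(sample):
    """Sample variance with divisor n - 1."""
    n = len(sample)
    m = Fraction(sum(sample), n)
    return sum((v - m) ** 2 for v in sample) / (n - 1)

def expected_s2(samples):
    """Average of s^2 over equally likely samples."""
    values = [s2(s) for s in samples]
    return sum(values) / len(values)

n = 3
E_s2_wor = expected_s2(combinations(population, n))      # without replacement
E_s2_wr = expected_s2(product(population, repeat=n))     # with replacement
print(E_s2_wor, S2, E_s2_wr, sigma2)
```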

In sampling without replacement, an unbiased estimator of $\operatorname{var}(\bar{y})$ is given by: $v(\bar{y}) = \left(\frac{1-f}{n}\right) s^2$.

In sampling with replacement, an unbiased estimator of $\operatorname{var}(\bar{y})$ is given by: $v(\bar{y}) = \frac{s^2}{n}$.

The variance of $\hat{Y} = N\bar{y}$ is given by:

$$\operatorname{var}(\hat{Y}) = \operatorname{var}(N\bar{y}) = N^2 \operatorname{var}(\bar{y}) = N^2\left(\frac{N-n}{N}\right)\frac{S^2}{n} = \left(\frac{1-f}{n}\right) N^2 S^2.$$

In sampling without replacement, an unbiased estimator of $\operatorname{var}(\hat{Y})$ is given by: $v(\hat{Y}) = \left(\frac{1-f}{n}\right) N^2 s^2$.

In sampling with replacement, an unbiased estimator of $\operatorname{var}(\hat{Y})$ is given by: $v(\hat{Y}) = \frac{N^2 s^2}{n}$.
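
The estimators above can be collected into one small routine. This is a sketch under stated assumptions: the sample values and N are made up, the observations are taken to be integers so that exact fractions can be used, and the function name `srs_estimates` is hypothetical.

```python
from fractions import Fraction

def srs_estimates(sample, N, with_replacement=False):
    """Estimated mean and total with their estimated variances for an SRS."""
    n = len(sample)
    y_bar = Fraction(sum(sample), n)                       # sample mean
    s2 = sum((y - y_bar) ** 2 for y in sample) / (n - 1)   # sample variance
    f = Fraction(0) if with_replacement else Fraction(n, N)
    v_mean = (1 - f) * s2 / n                              # v(y_bar)
    return {"mean": y_bar, "total": N * y_bar,
            "v_mean": v_mean, "v_total": N ** 2 * v_mean}

wor = srs_estimates([4, 6, 8], N=5)                        # hypothetical sample
wr = srs_estimates([4, 6, 8], N=5, with_replacement=True)
print(wor, wr)
```

Setting f to zero for sampling with replacement reproduces $v(\bar{y}) = s^2/n$ and $v(\hat{Y}) = N^2 s^2/n$ from the same code path.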

 Theorem:
The covariance between the sample means $\bar{x}$ and $\bar{y}$ in a simple random sample of n units drawn
without replacement from a population of N units is: $\operatorname{cov}(\bar{x}, \bar{y}) = \left(\frac{1-f}{n}\right) S_{xy}$. Also, the correlation
coefficient between $\bar{x}$ and $\bar{y}$ is: $\rho_{\bar{x}\bar{y}} = \frac{S_{xy}}{S_x S_y} = \rho_{xy}$. Here,

$$S_{xy} = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{X})(y_i - \bar{Y}) = \left(\frac{N}{N-1}\right) E(x_i - \bar{X})(y_i - \bar{Y}) = \left(\frac{N}{N-1}\right)\sigma_{xy}$$

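Proof:
Let $u_i = x_i + y_i$ so that $\bar{u} = \bar{x} + \bar{y}$. The corresponding population mean of $u_i$ is $\bar{U} = \bar{X} + \bar{Y}$. Writing
$\operatorname{cov}(x, y)$ for the population covariance $S_{xy}$, we have that

$$E(\bar{u} - \bar{U})^2 = \operatorname{var}(\bar{u}) = \left(\frac{1-f}{n}\right) S_u^2 = \left(\frac{1-f}{n}\right)\frac{1}{N-1}\sum_{i=1}^{N} (u_i - \bar{U})^2 = \left(\frac{1-f}{n}\right)\frac{1}{N-1}\sum_{i=1}^{N} (x_i + y_i - \bar{X} - \bar{Y})^2$$

$$= \left(\frac{1-f}{n}\right)\frac{1}{N-1}\sum_{i=1}^{N} \left\{(x_i - \bar{X}) + (y_i - \bar{Y})\right\}^2 = \left(\frac{1-f}{n}\right)\left\{S_x^2 + S_y^2 + 2\operatorname{cov}(x, y)\right\}$$

Now, we have that

$$E(\bar{u} - \bar{U})^2 = E(\bar{x} + \bar{y} - \bar{X} - \bar{Y})^2 = E\left\{(\bar{x} - \bar{X}) + (\bar{y} - \bar{Y})\right\}^2 = E(\bar{x} - \bar{X})^2 + E(\bar{y} - \bar{Y})^2 + 2E(\bar{x} - \bar{X})(\bar{y} - \bar{Y})$$

$$= \operatorname{var}(\bar{x}) + \operatorname{var}(\bar{y}) + 2\operatorname{cov}(\bar{x}, \bar{y}) = \left(\frac{1-f}{n}\right)\left(S_x^2 + S_y^2\right) + 2\operatorname{cov}(\bar{x}, \bar{y})$$

So, we have from the above that

$$\left(\frac{1-f}{n}\right)\left(S_x^2 + S_y^2\right) + 2\operatorname{cov}(\bar{x}, \bar{y}) = \left(\frac{1-f}{n}\right)\left\{S_x^2 + S_y^2 + 2\operatorname{cov}(x, y)\right\}$$

$$\Rightarrow\; \operatorname{cov}(\bar{x}, \bar{y}) = \left(\frac{1-f}{n}\right)\operatorname{cov}(x, y) = \left(\frac{1-f}{n}\right) S_{xy}$$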
 n   n 
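Again, we have that

$$\rho_{\bar{x}\bar{y}} = \frac{\operatorname{cov}(\bar{x}, \bar{y})}{\sqrt{\operatorname{var}(\bar{x})\operatorname{var}(\bar{y})}} = \frac{\left(\frac{1-f}{n}\right) S_{xy}}{\sqrt{\left(\frac{1-f}{n}\right) S_x^2 \left(\frac{1-f}{n}\right) S_y^2}} = \frac{S_{xy}}{S_x S_y} = \rho_{xy}$$
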
Thus, in simple random sampling, the correlation coefficient between the sample means is
independent of the sample size and is equal to the correlation between individual observations.
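
The covariance result can be checked by enumeration on a small paired population. The sketch below uses hypothetical x and y values with N = 5 and n = 2, lists every sample without replacement, and compares the exact covariance of the sample means with $\left(\frac{1-f}{n}\right) S_{xy}$. To stay in exact arithmetic, the correlation identity is checked in squared form. The helpers `mean`, `S` and `cov` are names made up for the example.

```python
from fractions import Fraction
from itertools import combinations

x = [1, 2, 3, 4, 5]                      # hypothetical paired population values
y = [2, 1, 4, 3, 6]
N, n = len(x), 2
f = Fraction(n, N)

def mean(v):
    return Fraction(sum(v), len(v))

def S(a, b):
    """Population mean product with divisor N - 1; S(a, a) gives S_a^2."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / (N - 1)

def cov(a, b):
    """Covariance over equally likely outcomes (divisor = number of outcomes)."""
    ma, mb = mean(a), mean(b)
    return sum((p - ma) * (q - mb) for p, q in zip(a, b)) / len(a)

idx = list(combinations(range(N), n))    # every sample, as index pairs
xbars = [mean([x[i] for i in s]) for s in idx]
ybars = [mean([y[i] for i in s]) for s in idx]
cov_means = cov(xbars, ybars)
print(cov_means, (1 - f) / n * S(x, y))
```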
 Relative error of the estimators:
In sampling theory, standard errors serve as absolute measures of the precision of sample
estimators. The errors in the sample estimators can also be assessed in relative terms. We call
this measure the coefficient of variation. In simple random sampling, the coefficient of variation
for the sample mean is given by:

$$CV(\bar{y}) = \frac{\sqrt{\operatorname{var}(\bar{y})}}{\bar{Y}} = \frac{\sqrt{\left(\frac{1-f}{n}\right) S_y^2}}{\bar{Y}} = \frac{S_y}{\bar{Y}}\sqrt{\frac{1-f}{n}} = CV(y)\sqrt{\frac{1-f}{n}}$$

$$\Rightarrow\; \frac{\left\{CV(\bar{y})\right\}^2}{\left\{CV(y)\right\}^2} = \frac{1-f}{n} \approx \frac{1}{n} \quad \text{(for large } N\text{)}$$

Again, we have that

$$CV(\hat{Y}) = \frac{\sqrt{\operatorname{var}(\hat{Y})}}{Y} = \frac{\sqrt{\operatorname{var}(N\bar{y})}}{Y} = \frac{N\sqrt{\operatorname{var}(\bar{y})}}{N\bar{Y}} = \frac{\sqrt{\operatorname{var}(\bar{y})}}{\bar{Y}} = CV(\bar{y})$$

So, we can say that in simple random sampling, the coefficient of variation of an estimated total
is the same as that of an estimated mean.
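
A quick numerical check of this equality is sketched below. The values of $S^2$, $\bar{Y}$, N and n are hypothetical, and the function names `cv_mean` and `cv_total` are chosen only for the illustration.

```python
from math import sqrt, isclose

def cv_mean(S2, Y_bar, N, n):
    """Coefficient of variation of the sample mean under SRS without replacement."""
    f = n / N
    return sqrt((1 - f) * S2 / n) / Y_bar

def cv_total(S2, Y_bar, N, n):
    """Coefficient of variation of the estimated total N * y_bar."""
    f = n / N
    var_total = N ** 2 * (1 - f) * S2 / n
    return sqrt(var_total) / (N * Y_bar)       # the population total is N * Y_bar

a = cv_mean(S2=10, Y_bar=6, N=5, n=2)          # hypothetical inputs
b = cv_total(S2=10, Y_bar=6, N=5, n=2)
print(a, b)
```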

 Advantages of simple random sampling:


1) It is simple to conceptualize.
2) It provides foundation for much of statistical theory.
3) It provides a basis to which other methods can be compared.
4) Since the sampling units are selected at random, giving each unit an equal chance
of being selected, the element of subjectivity or personal bias in selection is
eliminated.

 Disadvantages of simple random sampling:


1) It requires an up-to-date frame from which samples are to be drawn. So, all units
in the population must be identified and labeled prior to sampling. This process is
potentially so expensive and time consuming that it becomes unrealistic to
implement.
2) Sampled individuals may be so widely dispersed that visiting each selected
individual may be extremely expensive and time consuming.
3) Certain subgroups in the population may be totally overlooked or may be over
represented in the sample as a result of chance factor. In either case, the estimated
parameters are likely to be in error.
4) When the population measurements vary considerably in size, simple
random sampling produces larger variances than other methods of sampling.
