selection. In order to have a random selection method, you must set up some process or
procedure that assures that the different units in your population have equal probabilities of
being chosen. Humans have long practiced various forms of random selection, such as picking
a name out of a hat, or choosing the short straw. These days, we tend to use computers as the
mechanism for generating random numbers as the basis for random selection.
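As a minimal sketch of what computer-based random selection can look like,
here is a short Python example using the standard library's random module.
The function name draw_random_sample and the list of names are hypothetical
and only illustrate the idea; the point is that every unit in the list has
the same chance of being chosen, just like names drawn out of a hat.

```python
import random


def draw_random_sample(population, n, seed=None):
    """Return n distinct units; every unit is equally likely to be chosen."""
    rng = random.Random(seed)  # a seed makes the draw repeatable
    return rng.sample(population, n)


# Hypothetical population: the "names in the hat".
names = ["Ana", "Ben", "Chen", "Dee", "Eli", "Fay", "Gus", "Hana"]
chosen = draw_random_sample(names, 3, seed=1)
print(chosen)
```

Fixing the seed is optional. It is used here only so the same draw can be
reproduced later.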
Some Definitions
Before I can explain the various probability methods, we have to define
some basic terms. These are:
That's it. With those terms defined we can begin to define the different
probability sampling methods.
For this to work, it is essential that the units in the population are
randomly ordered, at least with respect to the characteristics you are
measuring. Why would you ever want to use systematic random
sampling? For one thing, it is fairly easy to do. You only have to select a
single random number to start things off. It may also be more precise
than simple random sampling. Finally, in some situations there is simply
no easier way to do random sampling. For instance, I once had to do a
study that involved sampling from all the books in a library. Once a book
was selected, I would have to go to the shelf, locate the book, and record
when it last circulated. I knew that I had a fairly good sampling frame in
the form of the shelf list (which is a card catalog where the entries are
arranged in the order they occur on the shelf). To do a simple random
sample, I could have estimated the total number of books and generated
random numbers to draw the sample; but how would I find book #74,329
easily if that is the number I selected? I couldn't very well count the cards
until I came to 74,329! Stratifying wouldn't solve that problem either. For
instance, I could have stratified by card catalog drawer and drawn a
simple random sample within each drawer. But I'd still be stuck counting
cards. Instead, I did a systematic random sample. I estimated the
number of books in the entire collection. Let's imagine it was 100,000. I
decided that I wanted to take a sample of 1000 for a sampling fraction of
1000/100,000 = 1%. To get the sampling interval k, I divided N by n:
100,000/1000 = 100. Then I selected a random integer between 1 and
100. Let's say I got 57. Next I did a little side study to determine how
thick a thousand cards are in the card catalog (taking into account the
varying ages of the cards). Let's say that on average I found that two
cards that were separated by 100 cards were about .75 inches apart in
the catalog drawer. That information gave me everything I needed to
draw the sample. I counted to the 57th by hand and recorded the book
information. Then, I took a compass. (Remember those from your high-
school math class? They're the funny little metal instruments with a
sharp pin on one end and a pencil on the other that you used to draw
circles in geometry class.) Then I set the compass at .75", stuck the pin
end in at the 57th card and pointed with the pencil end to the next card
(approximately 100 books away). In this way, I approximated selecting
the 157th, 257th, 357th, and so on. I was able to accomplish the entire
selection procedure in very little time using this systematic random
sampling approach. I'd probably still be there counting cards if I'd tried
another random sampling method. (Okay, so I have no life. I got
compensated nicely, I don't mind saying, for coming up with this
scheme.)
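The arithmetic in the library example can be sketched in a few lines of
Python. This is only an illustration under the numbers imagined above
(N = 100,000, n = 1000, a random start of 57); the function name
systematic_sample_positions is hypothetical, and the sketch assumes N is at
least as large as n.

```python
import random


def systematic_sample_positions(N, n, start=None, seed=None):
    """Return the 1-based positions picked by a systematic random sample."""
    k = N // n  # sampling interval
    if start is None:
        # Random start: an integer between 1 and k, inclusive.
        start = random.Random(seed).randint(1, k)
    # Take every k-th unit from the start, keeping at most n of them.
    return list(range(start, N + 1, k))[:n]


# The numbers from the library example, with the random start fixed at 57.
positions = systematic_sample_positions(N=100_000, n=1000, start=57)
print(positions[:4])   # [57, 157, 257, 357]
print(len(positions))  # 1000
```

Leaving start unset draws the random start for you, which is the only
random number the method needs.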
For instance, in the figure we see a map of the counties in New York State.
Let's say that we have to do a survey of town governments that will require
us to go to the towns personally. If we do a simple random sample state-wide
we'll have to
cover the entire state geographically. Instead, we decide to do a cluster
sampling of five counties (marked in red in the figure). Once these are
selected, we go to every town government in the five areas. Clearly this
strategy will help us to economize on our mileage. Cluster or area
sampling, then, is useful in situations like this, and is done primarily for
efficiency of administration. Note also that we probably don't have to
worry about using this approach if we are conducting a mail or telephone
survey, because where we call or send letters has little effect on cost or
efficiency.
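The county example can be sketched in Python as follows. The county and
town names here are made-up placeholders, not real New York data, and the
function name cluster_sample is hypothetical. The sketch randomly selects
whole clusters (counties) and then keeps every unit (town) inside each
selected cluster.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters clusters and return all units inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Every unit in a chosen cluster is measured.
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units


# Placeholder data: 10 hypothetical counties with 3 towns each.
towns_by_county = {
    f"County {i}": [f"Town {i}-{j}" for j in range(1, 4)]
    for i in range(1, 11)
}
chosen_counties, towns_to_visit = cluster_sample(towns_by_county, 5, seed=0)
print(chosen_counties)
print(len(towns_to_visit))  # 15
```

Because only five counties are visited, the travel is concentrated in a
few areas rather than spread across the whole state.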
Multi-Stage Sampling
The four methods we've covered so far -- simple, stratified, systematic
and cluster -- are the simplest random sampling strategies. In most real
applied social research, we would use sampling methods that are
considerably more complex than these simple variations. The most
important principle here is that we can combine the simple methods
described earlier in a variety of useful ways that help us address our
sampling needs in the most efficient and effective manner possible.
When we combine sampling methods, we call this multi-stage
sampling.
For example, consider the idea of sampling New York State residents for
face-to-face interviews. Clearly we would want to do some type of cluster
sampling as the first stage of the process. We might sample townships
or census tracts throughout the state. But in cluster sampling we would
then go on to measure everyone in the clusters we select. Even if we are
sampling census tracts we may not be able to measure everyone who is
in the census tract. So, we might set up a stratified sampling process
within the clusters. In this case, we would have a two-stage sampling
process with stratified samples within cluster samples. Or, consider the
problem of sampling students in grade schools. We might begin with a
national sample of school districts stratified by economics and
educational level. Within selected districts, we might do a simple random
sample of schools. Within schools, we might do a simple random sample
of classes or grades. And, within classes, we might even do a simple
random sample of students. In this case, we have three or four stages in
the sampling process and we use both stratified and simple random
sampling. By combining different sampling methods we are able to
achieve a rich variety of probabilistic sampling methods that can be used
in a wide range of social research contexts.
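To make the grade-school example concrete, here is a minimal Python sketch
of a multi-stage design: a stratified sample of districts, then a simple
random sample of schools within each selected district, then a simple
random sample of students within each selected school. All data, field
names, and the function name multistage_sample are hypothetical, and the
stage that samples classes is left out to keep the sketch short.

```python
import random


def multistage_sample(districts, per_stratum, schools_per_district,
                      students_per_school, seed=None):
    """Stratified sample of districts, then schools, then students."""
    rng = random.Random(seed)

    # Stage 1: group districts by stratum and sample within each stratum.
    by_stratum = {}
    for district in districts:
        by_stratum.setdefault(district["stratum"], []).append(district)
    chosen_districts = []
    for stratum in sorted(by_stratum):
        chosen_districts += rng.sample(by_stratum[stratum], per_stratum)

    # Stages 2 and 3: simple random samples of schools, then of students.
    students = []
    for district in chosen_districts:
        for school in rng.sample(district["schools"], schools_per_district):
            students += rng.sample(school["students"], students_per_school)
    return students


# Placeholder data: 8 districts in 2 strata, 5 schools each, 10 students each.
districts = [
    {
        "name": f"D{i}",
        "stratum": "low" if i < 4 else "high",
        "schools": [
            {"name": f"D{i}-S{j}",
             "students": [f"D{i}-S{j}-P{p}" for p in range(10)]}
            for j in range(5)
        ],
    }
    for i in range(8)
]
sample = multistage_sample(districts, per_stratum=2, schools_per_district=2,
                           students_per_school=3, seed=42)
print(len(sample))  # 24 = 2 strata x 2 districts x 2 schools x 3 students
```

Swapping the sampling rule used at any one stage (for example, a systematic
sample of students) gives a different multi-stage design without changing
the overall structure.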