SRME Lecture 8 Notes
Lecture 8: Sampling
Covered Areas:
I. Research Universe
II. Sample
III. Sampling Types
IV. Sampling Process
V. Sampling Errors
Objectives:
To be able to define and explain the concept of research universe,
To be able to define and explain the concepts of sample and sampling,
To be able to evaluate why sampling is needed and the importance of sampling,
To be able to explain sampling types and pros/cons of them,
To be able to explain the main stages of the sampling process,
To know sampling errors and be able to interpret their causes.
I. RESEARCH UNIVERSE
In its most general definition, the research universe is the large set consisting of all the observation
units that can provide data on the subject of interest to the researcher.
In scientific research, it is usually preferable to study a small group selected from the population, that
is, the sample, rather than every unit, and to learn about the population from the data collected there.
In broad terms, a sample is a group selected from a certain universe to represent this universe within
some rules.
When certain conditions are met, the results obtained by conducting research on the sample can be
generalized to the main population.
The universe is the set of elements to which the results of the research are intended to be generalized.
Every research has a universe.
The size of the universe depends on the subject and purpose of the research and what the
generalizations will cover. A population that is suitable for one study may not be suitable for another.
The identification and delimitation of the universe is determined entirely by the purpose of the research
and the choices of the researcher.
For example, in one study the population may be defined as the people living in a region; in another
study, it may be limited to only those people in this region who are under a certain age, of a certain
gender, and at a certain socioeconomic level.
In some cases, it is not possible to reach the entire universe.
For example, if a factory wants to test the durability of its product under certain conditions, it would not
make sense to subject all products to such a test.
In another example, it will be very difficult to reach all companies in a country when they are spread
over a very large geographical region.
II. SAMPLE
In general terms, a sample is a group selected from a certain universe to represent this universe within
some rules. The results obtained by conducting research on the sample are generalized to the main
population.
Working on the sample allows the researcher to get to know the population at less cost, because
sampling saves the researcher time, energy, and money. In addition, the purpose of research is not to
collect a lot of data, but to collect valid and reliable data.
The large set containing all the observation units is called the universe, and the small set containing
some of these units and created to represent the main mass is called the sample.
In order for the sample to be selected, the researcher must first have a list of the universe he wants to
examine. The list containing the units in the population is called the sampling frame.
Without a frame, proper sampling cannot be done.
III. SAMPLING TYPES
Random sampling involves methods in which the probability of selection of sampling units in the
population is known and this probability is not zero.
If no unit in the population is favored when the sample is created, and all units are given an equal
chance of being selected, the selection is random.
In non-random sampling, the probability of a unit being selected is not known.
In other words, when creating the sample, the differences between the units in the population are taken
into account and these units are not given an equal chance of being selected.
In non-random sampling, the opinions and subjective value judgments of the researchers influence which
units are selected for the sample.
Random Sampling Methods (Probability Sampling):
Random sampling is required if the information obtained from the sample is to be tested with
mathematical and statistical techniques and generalized to the population.
There are various random sampling methods, including simple random sampling, systematic sampling,
stratified sampling, cluster sampling, and multi-stage sampling.
1. Simple Random Sampling
In this method, which is the most basic form of random sampling, each unit selected for the sample is
determined randomly. Here, each observation unit in the population has an equal chance of entering the
sample.
Simple random sampling is most appropriate when the properties of the units in the population that are
relevant to the research subject are homogeneous, in other words, similar.
Let's assume that we are going to create a sample by selecting 300 people from a company with 1500
employees through simple random sampling. In order to create the sample, we must first have a list of
the names of all employees. Let's obtain such a list from the company's human resources department.
The next thing to do would be to give all the employees on this list a sequence number, from 1 to 1500.
Finally, 300 distinct numbers between 1 and 1500 are drawn at random, and the employees holding those
numbers form the sample.
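The company example can be sketched in a few lines of Python. This is only an illustration: the function
name simple_random_sample and the fixed seed are assumptions made here for reproducibility, not part of
the lecture.

```python
import random

POPULATION_SIZE = 1500  # employees on the HR list, numbered 1..1500
SAMPLE_SIZE = 300       # employees to be selected

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct sequence numbers from 1..population_size.

    Every unit has the same chance of selection, and sampling is done
    without replacement, so no employee can be picked twice.
    """
    rng = random.Random(seed)                # seed is only for reproducibility
    frame = range(1, population_size + 1)    # the sampling frame as sequence numbers
    return sorted(rng.sample(frame, sample_size))

selected = simple_random_sample(POPULATION_SIZE, SAMPLE_SIZE, seed=42)
print(len(selected), "employees selected; first five numbers:", selected[:5])
```

In practice the selected sequence numbers would be looked up in the human resources list to find the
corresponding employees.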
2. Systematic Sampling
When the sampling frame cannot be fully established or there are random errors in the list, systematic
sampling may be used instead of simple random sampling.
Systematic sampling may also be preferred when the population is very large, because simple random
sampling would take a very long time.
To use the systematic sampling method, the researcher must first determine the sample size and thus
calculate what fraction of the population will enter the sample. The inverse of this fraction gives the
sampling interval k. The first unit is randomly selected from the first k units of the sampling frame, and
every kth unit after it is then selected.
In systematic sampling, all units are equally likely to be selected for the sample. However, unlike simple
random sampling, not every sample of the same size has an equal chance of being selected, because
once the first unit is selected, the remaining units that will enter the sample are determined
automatically.
Example: Let's assume that we decide to conduct a survey on a certain subject with a sample taken
from a university with 8000 students. The first order of business is to list the names of all of these
students, giving each name on the list a sequence number, starting with 1. We want to select a sample
of 5% of this population, i.e. 400 students. If we have decided to use systematic sampling to select the
400 students, we must first calculate the sampling fraction, found by dividing the sample size by the size
of the universe. In this example, the sampling fraction is 400/8000 = 1/20. In other words, the sample
size is one-twentieth of the population, so the sampling interval is 20. We randomly select one of the first
20 students and then select every 20th student after that one from the list.
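The university example can be sketched as follows. This is a minimal illustration that assumes the
students are numbered from 1 and that the population size is an exact multiple of the sample size; the
function name systematic_sample and the seed are assumptions introduced here.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Pick a random start in the first interval, then every k-th unit after it."""
    if population_size % sample_size != 0:
        # Simplifying assumption of this sketch: N is an exact multiple of n.
        raise ValueError("population_size must be a multiple of sample_size")
    interval = population_size // sample_size    # k = 8000 / 400 = 20
    rng = random.Random(seed)
    start = rng.randint(1, interval)             # one of the first k units, inclusive
    return list(range(start, population_size + 1, interval))

students = systematic_sample(8000, 400, seed=7)
print(len(students), "students; first three numbers:", students[:3])
```

Only the starting number is random. Once it is drawn, the other 399 sequence numbers follow from it.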
When using this method, particular attention should be paid to ensuring that no systematic error is
introduced by a regular pattern in the list.
Example: Let's imagine that we systematically select 20 houses from 200 houses on a street where the
houses on both sides are heated with natural gas. All of the selected houses will be either odd or even
numbered, because the sampling interval is 10: if the first unit selected is odd, all the units that come
after it will be odd, and if the first unit is even, all the remaining units will be even. Since the house
numbers are odd on one side of the street and even on the other side, the houses in the sample will
always be on the same side of the street. If one of these sides is the sunlit side of the street and the
other is not, a survey carried out on houses on only one side cannot accurately and objectively reveal
the level of natural gas use for heating.
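The street example can be checked with a short sketch. It assumes the houses are numbered 1 to 200 and
that odd numbers lie on one side and even numbers on the other; the function names are hypothetical.

```python
def systematic_houses(start, interval=10, total_houses=200):
    """House numbers chosen when every `interval`-th house is taken from `start`."""
    return list(range(start, total_houses + 1, interval))

def sides_covered(house_numbers):
    """Sides of the street that appear in the sample (odd and even sides)."""
    return {"odd side" if n % 2 else "even side" for n in house_numbers}

# Try every possible random start within the first interval of 10 houses.
results = {start: sides_covered(systematic_houses(start)) for start in range(1, 11)}
for start, sides in results.items():
    print("start", start, "->", sorted(sides))
```

Whatever the starting house, the sample covers one side only, because the interval of 10 is an even
number and therefore preserves the parity of the starting house number.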
3. Stratified Sampling
Stratified sampling aims to make the sampling process more effective by using the available information
about the population. In this method, first of all, the units in the population are separated according to
the main characteristics that are determined by the researcher's purpose.
This separation produces the layers (strata) of the population. The purpose of layering is to increase the
similarity of the selected sample to the population as much as possible, thus minimizing sampling error
by increasing the power of the sample to represent the population.
In stratified sampling, the population is divided into layers, with each unit in the population belonging to
only one layer.
In layering, characteristics such as income level, gender, age, region, department, product type, industry
type, etc. are mostly used. For example, in a market research, the target customer group can be
stratified in terms of characteristics such as gender, age, social class, etc.
After the layers are created, a random sample is taken from each layer in proportion to the number of
individuals in that layer, so that a separate sample is selected for every layer.
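Proportional allocation across layers can be sketched as below. The three departments and their sizes
are hypothetical data chosen for illustration, and rounding each layer's share is a simplification:
with other sizes the rounded shares may not add up exactly to the requested sample size.

```python
import random

def proportional_stratified_sample(strata, sample_size, seed=None):
    """Draw a simple random sample from each layer, sized by the layer's share."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Layer share of the sample; rounding is a simplification of this sketch.
        layer_n = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, layer_n)
    return sample

# Hypothetical population of 1000 employees in three layers.
strata = {
    "production": [f"P{i}" for i in range(600)],
    "sales": [f"S{i}" for i in range(300)],
    "management": [f"M{i}" for i in range(100)],
}
sample = proportional_stratified_sample(strata, sample_size=100, seed=1)
print({name: len(units) for name, units in sample.items()})
```

Each unit belongs to exactly one layer, so the layer samples never overlap.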
4. Cluster Sampling
If there is no list of units that make up the population, or if the units to be randomly selected individually
are spread over a very large geographical area (scattered), cluster sampling can be applied.
The basis of the cluster sampling method is the idea that the units in the population are not sampled
individually, but together with the clusters they form.
The clusters are naturally occurring groups in which the units of the population come together in a
mixed way. In this method, the population is divided into a certain number of clusters in terms of
geographical features, not in terms of any feature of the observation units.
For example, families living in the same street and neighborhood, or students studying at the same
school, teachers working in the same school, and industrial enterprises in the Marmara region are
counted as clusters.
Cluster sampling is a method used to facilitate the sampling process rather than to increase the
precision of the sample. However, the method has a significant downside. In simple random sampling,
each unit in the population is given an equal chance of being selected, whereas here only the clusters
are given an equal chance of being selected. If the units within a cluster are related to each other, which
is usually the case, systematic error will occur.
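A minimal sketch of one-stage cluster sampling follows. The schools and student labels are hypothetical,
and the sketch assumes that every unit in a selected cluster enters the sample.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and keep every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # only clusters are randomized
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical clusters: students grouped by the school they attend.
clusters = {
    "school_A": ["A1", "A2", "A3"],
    "school_B": ["B1", "B2"],
    "school_C": ["C1", "C2", "C3", "C4"],
    "school_D": ["D1", "D2"],
}
chosen, units = cluster_sample(clusters, n_clusters=2, seed=5)
print(chosen, units)
```

Note that only the list of clusters is needed as a sampling frame, not a list of every individual unit.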
5. Multistage Sampling
Multistage sampling combines several sampling methods, often involving clusters followed by random or
systematic sampling within those clusters.
Example: Selecting districts first, then schools, and finally students.
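The districts, schools, and students example can be sketched as three nested random draws. The data
and the numbers drawn at each stage are hypothetical, and simple random selection is assumed at every
stage.

```python
import random

def multistage_sample(districts, n_districts, n_schools, n_students, seed=None):
    """Select districts, then schools within them, then students within those schools."""
    rng = random.Random(seed)
    selected = []
    for district in rng.sample(sorted(districts), n_districts):       # stage 1
        schools = districts[district]
        for school in rng.sample(sorted(schools), n_schools):         # stage 2
            for student in rng.sample(schools[school], n_students):   # stage 3
                selected.append((district, school, student))
    return selected

# Hypothetical frame: 4 districts, 3 schools per district, 10 students per school.
districts = {
    f"district_{d}": {
        f"school_{d}_{s}": [f"student_{d}_{s}_{i}" for i in range(1, 11)]
        for s in range(1, 4)
    }
    for d in range(1, 5)
}
picked = multistage_sample(districts, n_districts=2, n_schools=2, n_students=3, seed=3)
print(len(picked), "students selected, for example:", picked[0])
```

A full list of students is needed only for the schools that are actually selected at the second stage.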
Non-Random Sampling Methods (Non-Probability Sampling):
1. Convenience Sampling
This sampling method has the lowest cost and is the easiest to implement.
The researcher builds the sample from the units in the population that are easiest to reach. In other
words, the researcher is free to sample whomever he or she wants.
The researcher starts with the most accessible respondents and keeps adding units in the most
accessible and lowest-cost way until the sample reaches the size needed.
Convenience sampling presents some findings to the researcher, but these findings cannot be
generalized, because it is not known which population the selected sample represents.
3. Snowball Sampling
In the snowball sampling method, one unit is first reached from a population that is difficult to reach,
then another unit is reached with the help of that unit, and then other units are reached with their help
until the targeted sample size and diversity are achieved.
If the researcher feels directed away from the goal of the study, or cannot reach the intended sample,
he or she should try to reach the targeted sample by contacting different units.
4. Quota Sampling
Quotas are set so that certain prominent characteristics of the research universe are reflected in the
sample, and the sample is created accordingly.
In line with the purpose of the research, the population is divided into layers according to various criteria
such as geographical region, gender, age, and social class, and a sample is created by selecting units
from each layer according to the share of that layer in the population.
Because the units within each layer are selected arbitrarily, on the basis of convenience, this method is
not random.
In this method, the level of representation of the sample is higher than in other non-random methods,
because the features that are important for the research are taken into account while sampling and care
is taken to ensure that these features appear in the sample in proportions similar to those in the
population.
For example, suppose a researcher wants to select 1,000 people from a population of 50,000 by quota
sampling. Suppose there are 10,000 women and 40,000 men in this population. If this researcher wants
the sample to reflect the ratio of women and men in the population, the sample should consist of 200
women and 800 men.
In quota sampling, researchers divide the population into subgroups (e.g., age, gender, income level)
and set a quota for each subgroup. The selection of participants continues until the quota for each group
is filled, often without randomization.
Purpose: To ensure that the sample represents certain characteristics of the population.
Example: A study might set quotas to include 50% men and 50% women or 20 respondents from each
income bracket.
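The quota arithmetic for the example of 1,000 people from a population of 50,000, and the non-random
filling of quotas, can be sketched as follows. The function names and the short respondent list are
hypothetical, and respondents are assumed to be accepted simply in the order in which they are reached.

```python
def proportional_quotas(group_sizes, sample_size):
    """Quota per group, proportional to the group's share of the population."""
    total = sum(group_sizes.values())
    return {group: round(sample_size * size / total)
            for group, size in group_sizes.items()}

def fill_quotas(respondents, quotas):
    """Accept respondents in arrival order until every quota is full (no randomization)."""
    remaining = dict(quotas)
    accepted = []
    for person_id, group in respondents:
        if remaining.get(group, 0) > 0:
            accepted.append(person_id)
            remaining[group] -= 1
        if not any(remaining.values()):   # all quotas filled, stop recruiting
            break
    return accepted

quotas = proportional_quotas({"women": 10_000, "men": 40_000}, sample_size=1_000)
print(quotas)   # {'women': 200, 'men': 800}

# Tiny hypothetical illustration of filling quotas of 1 woman and 2 men.
arrivals = [(1, "men"), (2, "men"), (3, "women"), (4, "men"), (5, "women")]
print(fill_quotas(arrivals, {"women": 1, "men": 2}))   # [1, 2, 3]
```

The quotas fix the composition of the sample, but who fills each quota depends on who is reached first,
which is why the method remains non-random.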
5. Volunteer Sampling
Participants self-select into the study by volunteering.
Example: People responding to an online survey invitation.
IV. SAMPLING PROCESS
5. Selection of sampling units: After passing through the above-mentioned stages, the sampling
process is completed by selecting the observation units from the determined sample.
V. SAMPLING ERRORS
Sampling errors can basically be examined in two groups.
The errors in the first group are called random errors, based on the idea that the sample used in the
study will be somehow different from the population from which it was selected. Random error is almost
inevitable, because only a part of the population is taken into account in each sampling.
One of the most important reasons that increase random error is that the selected sample does not fully
represent the population, but only represents a part or a small proportion of it (Boke, 2009:137-138).
This type of error can be reduced, though not eliminated entirely, by increasing the sample size.
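The link between sample size and random error can be illustrated with the standard error of a sample
mean, which equals the population standard deviation divided by the square root of the sample size. The
sketch below assumes simple random sampling from a large population, and the standard deviation of 10
is a hypothetical value.

```python
import math

def standard_error(population_sd, sample_size):
    """Standard error of the sample mean under simple random sampling."""
    return population_sd / math.sqrt(sample_size)

# Quadrupling the sample size halves the random error; it never reaches zero.
for n in (25, 100, 400, 1600):
    print(n, standard_error(10.0, n))
```

Systematic errors behave differently: they come from how the sample was drawn, so a larger sample drawn
in the same way does not reduce them.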
The errors in the second group are called systematic errors. Systematic error arises from errors made
during the sampling process and cannot be corrected later. The sources of such errors are:
1. Incorrect selection of the sampling method
2. Incorrect identification of the population
3. Incorrect determination of the sampling frame
4. Incorrect drawing of the units to be included in the sample
5. Incorrect calculation of sample size (Islamoglu, 2009:172).