Probability Sampling

The document discusses probability sampling methods for research. It defines key concepts like population, which is the total group being studied, and sample, which is a subset of the population. Probability sampling aims to select a random and representative sample where every member of the population has an equal chance of being chosen. Specific probability sampling methods covered include simple random sampling, where participants are randomly selected from the population list.

PROBABILITY SAMPLING

INTRODUCTION

 In order to carry out a research study, you have to first acquire relevant information on
the subject.
 In other words, you have to collect data.
 This data is required to test your ‘hypotheses’, i.e., the tentative generalizations that you
have made.
 Let us suppose that, as a researcher, you want to look into the relationship between the study
habits and the achievement motivation of undergraduate students of Nursing. For this, you
have to select a few representative cases or samples from the entire population of
undergraduate students of Nursing.
 The process of selection demands thorough understanding of the concept of population,
sample and various sampling techniques.
 We shall familiarize you with the concepts of sample and population.
 We shall also discuss the characteristics of a good sample and the various methods of
sampling.

CONCEPT OF POPULATION AND SAMPLE

 A “sample” is a miniature representation of a larger group or aggregate, from which it is
selected.
 In other words, the sample provides a specimen picture of a larger whole.
 This larger whole is termed the “population” or “universe”.
 In research, this term is used in a broader sense; it is a well defined group that may
consist of individuals, objects, characteristics of human beings, or even the behaviour of
inanimate objects, such as the throw of a die or the tossing of a coin.
 It is usually neither possible nor necessary to include all units of a population in a study
in order to arrive at a valid conclusion.
 Moreover, the sizes of populations are often so large that the study of all the units would
not only be expensive but also cumbersome and time consuming.
 For example, there are more than fifty thousand undergraduate students in Nursing.

 For our research, it is impossible to collect information about the study habits of all these
students.
 So, for the survey a researcher will have to select a representative few, i.e., a sample from
the population. This process is known as sampling.
 If the nature of the population has to be inferred from a sample, it is necessary for the
sample to be truly representative of the population.
 Moreover, it calls for drawing a representative ‘proportion’ of the population.
 The population may contain a finite number of members or units.
 Sometimes, the population may be ‘infinite’ as in the case of air pressure at various points
in the atmosphere.
 Therefore, a population has to be defined clearly so that there is no ambiguity as to
whether a given unit belongs to the population or not.
 Otherwise, a researcher will not know what units to consider for selecting a sample.
 For example, suppose we want to understand the study habits of distance education students.
 Here, the population is not well defined: we are not told which university or universities
have to be included in this survey. After all, there are more than a hundred universities
in India that provide distance education, and there are thirteen state open universities.
 Hence, to define it accurately, we have to specify the group as, say, undergraduate
students of Nursing.
 The second issue related to the representativeness of a sample is to decide about the
‘sampling frame’, i.e., the listing of all the units of the population from which the sample
is drawn.
 In the above study, the frame can be organised under different categories, such as
male/female students, employed/unemployed students, etc.
 The sampling frame should be complete, accurate and up-to-date, and must be drawn up
before selecting the sample.
 Thirdly, a sample should be unbiased and objective.
 Ideally, it should reflect all the relevant characteristics of the population from which it
has been drawn. Inference from such a sample rests on the logic of induction, i.e.,
proceeding from the particular to the general, and its estimates differ from the population
values only within the range of random sampling errors. This is why the results are
expressed in terms of “probability”.

 A sample should not only be representative, but should also be adequate enough to
render stability to its characteristics.
 What, then, is the ideal size of a sample?
 An adequate sample is the one that contains enough cases to ensure reliable results.
 If the population under study is homogeneous, a small sample is sufficient.
 However, a much larger sample is necessary, if there is greater variability in the units of
population.
 Thus the procedure of determining the sample size varies with the nature of the
characteristics under study and their distribution in the population.
 Moreover, the adequacy of a sample will depend on our knowledge of the population as
well as on the method used in drawing the sample.
 For example, if we try to find out the study habits of undergraduate students of Lady
Irwin College, Delhi, the population will obviously be more homogeneous than the
population of undergraduate students of Nursing, with respect to socio-economic status,
employment of students or study hours available.
 However, it should be understood that the adequate size of the sample does not
automatically ensure accuracy of results.

METHODS OF SAMPLING

Sampling methods can be broadly classified into two categories:

i) Probability sampling
ii) Non-probability sampling

PROBABILITY SAMPLING

 Probability sampling is based on random selection of units from a population.


 In other words, the sampling process is not based on the discretion of the researcher but is
carried out in such a way that every unit in the population has a known, non-zero
probability of being included; in the simplest designs this probability is the same for
every unit.
 For example, in the case of a lottery, every individual has an equal chance of being selected.
Some of the characteristics of a probability sample are:
 each unit in the population has a known, non-zero probability of being selected in the sample,
 weights appropriate to the probabilities are used in the analysis of the sample, and
 the process of sampling is automatic in one or more steps of the selection of units
in the sample.
 Probability sampling can be done through different methods, each method having its own
strengths and limitations.

Simple or unrestricted random sampling

 Simple random sampling is a method of selecting a sample from a finite population in
such a way that every unit of the population is given an equal chance of being selected. In
practice, you can draw a simple random sample unit by unit through the following steps:
a. Define the population
b. Make a list of all the units in the population and number them from 1 to n
c. Decide the size of the sample, or the number of units to be included in the sample
d. Use either the ‘lottery method’ or ‘random number tables’ to pick the units to be
included in the sample.
 For example, you may use the lottery method to draw a random sample by using a set of
‘n’ tickets, with numbers ‘1 to n’ if there are ‘n’ units in the population.
 After shuffling the tickets thoroughly, the sample of a required size, say x, is selected by
picking the required x number of tickets.
 The units which have the serial numbers occurring on these tickets will be considered
selected.
 The assumption underlying this method is that the tickets are shuffled so that the
population can be regarded as arranged randomly.
 Similarly, to select 500 students from the total population of 50000 undergraduate
students of Nursing, you will write the roll numbers of all the students on small pieces of
paper.
 Jumble the chits well and then draw five hundred roll numbers.
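
The lottery method described above can be imitated on a computer. The following is only a minimal sketch: the function name lottery_sample and the seed value are illustrative assumptions, not part of the original method, and the units are assumed to be numbered 1 to 50000.

```python
import random

def lottery_sample(population_size, sample_size, seed=None):
    """Draw a simple random sample of unit numbers from 1..population_size."""
    rng = random.Random(seed)                  # a seed only makes the draw repeatable
    tickets = range(1, population_size + 1)    # one numbered ticket per unit
    # sample() draws without replacement, like picking chits from a well-shuffled bowl
    return rng.sample(tickets, sample_size)

# 500 roll numbers out of 50000 (the seed value is an arbitrary assumption)
chosen = lottery_sample(50_000, 500, seed=1)
print(len(chosen), len(set(chosen)))
```

Because the draw is made without replacement, no roll number can appear twice in the sample.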
 A more dependable method of drawing a simple random sample is to use a table of random
numbers.
 Such random number tables have been prepared by Fisher and Yates (1967).
 After assigning consecutive numbers to the units of the population, the researcher starts at
any point on the table of random numbers and reads the consecutive numbers in any
direction: horizontally, vertically or diagonally.
 If the read-out number corresponds with the one written on a unit card, then that unit is
chosen for the sample.
 Let us suppose that a sample of 5 study centers is to be selected at random from a
serially numbered population of 60 study centers. Using a part of a table of random
numbers reproduced here, five two-digit numbers (as the total population of study
centers, 60, is a two-digit figure) are selected.

Row \ Column    1      2      3      4      5     ---    n
     1        2315   7548   5901   8372   5993    ---   6744
     2        0554   5550   4310   5374   3508    ---   1343
     3        1487   1603   5032   4043   6223    ---   0834
     4        3897   6749   5094   0517   5853    ---   1695
     5        9731   2617   1899   7553   0870    ---   0510
     6        1174   2693   8144   3393   0862    ---   6850
     7        4336   1288   5911   0164   5623    ---   4036
     8        9380   6204   7833   2680   4491    ---   2571
     9        4954   0131   8108   4298   4187    ---   9527
    10        3676   8726   3337   9482   1569    ---   3880
    11         ---    ---    ---    ---    ---    ---    ---
    12         ---    ---    ---    ---    ---    ---    ---
     n        3914   5218   3587   4855   4881    ---   5042

 If you start with the first row of the first column and take the first two digits of each
entry, 23 is the first two-digit number; reading down the column, 05 is the next number,
and so on.
 Any point can be selected to start with the random numbers for drawing the desired
sample size.

 Suppose the researcher starts at row 1 of column 4; the number to start with is then 83.
 Reading down this column, he/she takes the first two digits of each entry, beginning with
83. The first eight numbers read, in order, are as follows (√ marks a number that lies
within the population range of 01 to 60):

83   53√   40√   05√   75   33√   01√   26√


 Now, in selecting the sample of 5 study centers, two numbers, 83 and 75, need to be
deleted as they are bigger than 60, the size of the population.
 The process of selection and deletion is stopped as soon as the required number of five
units has been selected.
 The selected numbers are 53, 40, 05, 33 and 01. If any number is repeated in the table, it
may be substituted by the next number from the same column.
 If the end of a column is reached, the researcher goes on to the next column until a
sample of the desired size is obtained.
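
The selection and deletion procedure for the 60 study centers can be traced in a few lines of code. This is a sketch under stated assumptions: the list COLUMN_4 simply copies column 4 of the table above, the function name pick_from_table is hypothetical, and treating 00 as unusable is an assumption that the text does not address.

```python
# Column 4 of the random number table above, read from row 1 to row 10.
COLUMN_4 = ["8372", "5374", "4043", "0517", "7553",
            "3393", "0164", "2680", "4298", "9482"]

def pick_from_table(entries, population_size, sample_size):
    """Take the first two digits of each entry; keep usable, unrepeated numbers."""
    selected = []
    for entry in entries:
        number = int(entry[:2])
        if number < 1 or number > population_size:
            continue          # e.g. 83 and 75 exceed 60 and are deleted
        if number in selected:
            continue          # a repeated number is passed over for the next one
        selected.append(number)
        if len(selected) == sample_size:
            break             # stop once the required units are selected
    return selected

print(pick_from_table(COLUMN_4, population_size=60, sample_size=5))
# [53, 40, 5, 33, 1]
```

The printed list matches the selected numbers 53, 40, 05, 33 and 01; the entry 26 is never reached because five units have already been found.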
 Simple random sampling gives every unit the same chance of selection and so guards
against bias in selection. However, from a practical point of view, a list of all the units
of a population is often not possible to obtain.
 Even if it is possible, it may involve a very high cost which a researcher or an
organization may not be able to afford.
 Therefore, simple random sampling is difficult to realize. Also, in the case of a
heterogeneous population, a simple random sample may not necessarily represent the
characteristics of the total population, even though all selected units participate in the
investigation.

 In the case of undergraduate students of the Open University in your country (assuming
you have one), students may be employed in different sectors and categories of
services/industries.
 In spite of your best efforts you may not be able to list all the categories of employment.
 In such a case, simple random sampling cannot guarantee that all the categories under
study are represented.

Systematic sampling

Systematic sampling provides a more even spread of the sample over the population list and
leads to greater precision. The process involves the following steps:

a. Make a list of the population units based on some order - alphabetical, seniority, street
number, house number or any such factor.
b. Determine the desired sampling fraction, say 50 out of 1000, and hence the sampling
interval K [K = N/n = 1000/50 = 20].
c. Starting with a randomly chosen number between 1 and K, both inclusive, select every
Kth unit from the list. If in the above example the randomly chosen number is 4, the
sample shall include the 4th, 24th, 44th, 64th, 84th units and so on, going up to
the 984th unit.
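
The three steps above can be sketched as follows. The function name systematic_sample is an illustrative assumption, and the sketch assumes that N is an exact multiple of n, as in the example of 50 units out of 1000.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the serial numbers of every Kth unit, with K = N // n."""
    k = population_size // sample_size            # sampling interval K
    if start is None:
        # random start between 1 and K, both inclusive
        start = random.Random(seed).randint(1, k)
    return list(range(start, population_size + 1, k))

units = systematic_sample(1000, 50, start=4)
print(units[:5], units[-1], len(units))
# [4, 24, 44, 64, 84] 984 50
```

Passing start=4 reproduces the example in the text; leaving start unset lets the sketch choose the random starting number itself.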
 When the list is in an essentially random order, this method provides a sample as good as
a simple random sample and is comparatively easier to draw.
 If a researcher is interested in studying the average telephone bill of an area in his/her
city, he/she may select every fourth telephone holder from the telephone directory,
starting from a randomly chosen entry among the first four, and find out their annual
telephone bills.
 However, this method suffers from the following drawbacks because of departure from
randomness in the arrangement of the population units.
i) Periodic effects
 Populations with more or less definite periodic trend are quite common.
 Students’ attendance at a residential university library open seven days
a week, sales of a store over twelve months in a year and flow of road
traffic past a particular point on a road over 24 hours are a few
examples of periodic trend or cyclic fluctuation in a given
population. In such cases a systematic sample may not represent the
population adequately or remain effective all the time.
ii) Trend
 Another handicap of systematic sampling emerges from the fact that very
often the population size ‘N’ is not an integral multiple of ‘k’.
 This leads to a varying number of units in the sample from the same finite
population, depending on the starting point.
 A trend in the listed order creates a further difficulty. Suppose a
population of 100 counsellors is listed according to seniority
and a researcher wants to select a sample of 20.
 First, he/she divides 100 by 20 to get 5 as the size of the interval.
 Suppose he/she picks 4 at random from 1 to 5 as a starting number.
 Then, he/she selects every 5th name thereafter, at 9, 14, 19, ..., until
he/she has drawn the desired 20 names.
 If he/she picks 2 as the starting point, another sample would consist of
2, 7, 12, .... In the latter sample each counsellor’s seniority is lower than
that of his/her counterpart in the former sample.
 The means of these two samples would therefore diverge
as regards seniority and other associated variables.
 Many such samples can be drawn by taking different starting points, but
there will be greater variation among them.
 Thus, the ‘periodic effects’ and ‘trend’ of the listed population unduly
increase the variability of the samples, and calculations made from such
samples cannot show the sources of variability.
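
The periodic effect can be made concrete with a small numerical sketch. All figures here are hypothetical: WEEK holds invented library attendance counts for Monday to Sunday, repeated over ten weeks, and the function name systematic_mean is an assumption made for illustration.

```python
WEEK = [120, 110, 115, 125, 130, 220, 260]   # hypothetical Mon..Sun attendance
attendance = WEEK * 10                        # 70 days with a 7-day cycle

def systematic_mean(values, k, start):
    """Mean of the systematic sample: units start, start + k, start + 2k, ..."""
    picked = values[start - 1::k]
    return sum(picked) / len(picked)

true_mean = sum(attendance) / len(attendance)
print(round(true_mean, 1))                                   # 154.3
print(systematic_mean(attendance, k=7, start=1))             # 120.0 (Mondays only)
print(systematic_mean(attendance, k=7, start=7))             # 260.0 (Sundays only)
print(round(systematic_mean(attendance, k=5, start=1), 1))   # 154.3
```

With an interval of 7 the sample falls on the same weekday every time, so the estimate depends entirely on the starting point. In this invented data an interval of 5, which does not coincide with the weekly cycle, happens to cover every weekday equally.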
 The main advantages of systematic sampling are:
a) It involves simple calculations.
b) It is less expensive than simple random sampling.

Stratified Random Sampling

 In some cases, the population to be sampled is not homogeneous.
 Therefore, rather than selecting randomly from the entire population, the main population
is divided into a number of sub-populations called strata, each of which is homogeneous
with respect to one or more characteristic(s).
 The sample elements are then selected from each stratum at random.
 Thus, all strata are represented in the sample.
 This approach to sampling is called stratified random sampling because the population is
stratified into its sub-populations and the condition of random selection is included by the
selection within the strata.
 The steps involved in the stratified sampling are given as follows:
i) Deciding upon the relevant stratification criteria such as sex, geographical region,
age, courses of study, etc.
ii) Dividing the total population into sub-populations based on the stratification
criteria.
iii) Listing the units separately in each sub-population.
iv) Selecting the requisite number of units from each sub-population by using an
appropriate random selection technique.
v) Consolidating the sub-samples for making the main sample
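
Steps (ii) to (v) can be sketched in code as below. This is an illustration under assumptions: the student records, the stratification by sex and the function name stratified_sample are all hypothetical, and the same number of units is taken from every stratum for simplicity.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, per_stratum, seed=None):
    """Divide units into strata, sample each stratum at random, then consolidate."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:                        # steps ii and iii: divide and list
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():           # step iv: random selection per stratum
        n = min(per_stratum, len(members))
        sample.extend(rng.sample(members, n))
    return sample                              # step v: consolidated main sample

# Hypothetical frame of 300 students: 200 female and 100 male.
students = [{"id": i, "sex": "F" if i % 3 else "M"} for i in range(1, 301)]
sample = stratified_sample(students, lambda s: s["sex"], per_stratum=20, seed=7)
print(len(sample))
```

Every stratum is guaranteed a place in the consolidated sample, which is the point of stratifying before selection.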
 Thus, stratification improves the representativeness of a sample by introducing a
secondary element of control.
 However, the efficiency of the stratified random sample depends on the allocation of
sample size to the strata.
 There are three types of allocation in stratified random sampling.


1) Equal Allocation
 In this type, all strata contribute the same number of sampling elements to the
sample.

 Thus, if there are three strata, one third of the sample would be selected from
each stratum. This type of allocation is done when the strata have populations of equal size.
2) Proportional Allocation
 In this type, each stratum contributes to the sample a number of elements that is
proportional to its size in the population.
 The larger the stratum, the more members it contributes to the sample.
 The sampling fraction remains constant.
 Suppose there are five strata to be sampled, with the respective population sizes given
below, and a 5% stratified random sample is to be selected.
 The proportional allocation will be done as follows:

Strata    Strata sizes    Sample size by strata
I         5000            250
II        1800            90
III       2000            100
IV        3500            175
V         450             23 (22.5 rounded off)
N = 12750
Sample size = 638 (637.5 rounded off)
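
The arithmetic of the table can be checked with a short sketch. The function name proportional_allocation is an assumption; exact fractions are used so that 22.5 is rounded up to 23, as in the table, instead of being affected by floating-point rounding.

```python
from fractions import Fraction
from math import floor

STRATA = {"I": 5000, "II": 1800, "III": 2000, "IV": 3500, "V": 450}
FRACTION = Fraction(5, 100)          # the 5% sampling fraction

def proportional_allocation(sizes, fraction):
    """Give each stratum size * fraction units, rounding halves upwards."""
    allocation = {}
    for name, size in sizes.items():
        exact = size * fraction                              # e.g. 450 * 5% = 22.5
        allocation[name] = floor(exact + Fraction(1, 2))     # round half up
    return allocation

allocation = proportional_allocation(STRATA, FRACTION)
print(allocation)
# {'I': 250, 'II': 90, 'III': 100, 'IV': 175, 'V': 23}
print(sum(STRATA.values()), sum(allocation.values()))
# 12750 638
```

The unrounded total is 5% of 12750, i.e. 637.5, which is why the rounded sample size comes to 638.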

3) Optimum Allocation
 In optimum allocation, the strata contributions to the sample are proportional to
the product of the strata population sizes and the variability of the dependent
variable within the strata.
 Large strata and strata with large variability will have larger contributions to the
sample.
 Because it requires good estimates of the population variability of the
dependent variable, which are seldom available before the sample is selected,
optimum allocation is used infrequently.
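
A minimal sketch of this rule is given below, with each stratum's share proportional to its population size multiplied by its variability. The standard deviations in STD_DEVS are invented purely for illustration, the stratum sizes are borrowed from the proportional allocation example, and the function name optimum_allocation is an assumption.

```python
def optimum_allocation(sizes, std_devs, total_sample):
    """Share of the sample per stratum, proportional to size * standard deviation."""
    weights = {h: sizes[h] * std_devs[h] for h in sizes}
    total_weight = sum(weights.values())
    return {h: total_sample * w / total_weight for h, w in weights.items()}

SIZES = {"I": 5000, "II": 1800, "III": 2000, "IV": 3500, "V": 450}
STD_DEVS = {"I": 10, "II": 20, "III": 5, "IV": 15, "V": 30}   # hypothetical values

shares = optimum_allocation(SIZES, STD_DEVS, total_sample=638)
for stratum, n_h in shares.items():
    print(stratum, round(n_h, 1))
```

With these invented figures stratum II receives more units than stratum III although it is smaller, because its assumed variability is four times as large. The shares are left unrounded; in practice they would be rounded to whole units.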
 Stratified random sampling is useful when a single list of units or individuals
for the whole population is not available, but the units can be listed
separately within each stratum.

 It is also useful in providing more accurate results than simple random sampling.
 For example, while selecting a sample of undergraduate students of the Open
University in your country, the researcher may divide the whole population of
undergraduate students into males and females; into the north, east, south and
west regions of the country; and into those employed in government, private and
autonomous institutions in the country.
 All these will be different strata. From each stratum the researcher may select 50
students as a sample.
 Sometimes stratification is not possible before collecting the data.
 The stratum to which a unit belongs may not be known until the researcher has
actually conducted the survey.
 Personal characteristics such as sex, social class, educational level, age etc., are
examples of such stratification criteria.
 The procedure in such situations involves taking of a random sample of the
required size and then classifying the units into various strata.
 The method is quite efficient provided the sample is reasonably large, i.e., more
than 20 in every stratum.

Cluster sampling

 Cluster sampling is used when the population under study is infinite, where a list of units
of population does not exist, when the geographic distribution of units is scattered, or
when sampling of individual units is not convenient for several administrative reasons.
 It involves division of the population into clusters that serve as primary sampling units.
 A selection of the clusters is then made to form the sample.
 Thus, in cluster sampling, the sampling unit is a cluster rather than an individual
member or item in the population.
 For example, for the purpose of selecting a sample of high school teachers in a state, you
may list all high schools instead of the teachers teaching in high schools and select
randomly a 10 per cent sample (say) of the schools as clusters.
 You may then use all the teachers of the selected schools as the sample or randomly select
a few of them.

 Any location within which we find an intact group of population members with similar
characteristics is termed a cluster.
 Examples of clusters include classrooms, schools, hospitals and study centers.
 Cluster sampling is economical, especially when the cost of measuring a unit is relatively
small and the cost of reaching it is relatively large.
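
The high school example can be sketched as follows. The school and teacher names are invented, the function name cluster_sample is an assumption, and the sketch takes all the teachers of each selected school, which is only one of the two options mentioned above.

```python
import random

def cluster_sample(clusters, fraction, seed=None):
    """Select a fraction of the clusters at random and take all their members."""
    rng = random.Random(seed)
    n_clusters = max(1, round(len(clusters) * fraction))
    chosen = rng.sample(sorted(clusters), n_clusters)    # clusters are the sampling units
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical frame: 50 schools with 5 teachers each.
schools = {
    f"school_{i:03d}": [f"teacher_{i:03d}_{j}" for j in range(1, 6)]
    for i in range(1, 51)
}
chosen, teachers = cluster_sample(schools, fraction=0.10, seed=3)
print(len(chosen), len(teachers))
```

Only a list of schools is needed to draw the sample; a list of individual teachers is required for the selected schools alone.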

Multistage sampling

 Multi-stage sampling is used in large scale surveys for a more comprehensive
investigation.
 The researcher may have to use two, three or even four stage sampling.
 For example, in surveys, mailed questionnaires are generally used to gather information
from people living in widely scattered areas.
 Although the method is cost effective, unreturned or partially completed questionnaires
may introduce a bias due to which a representative sample cannot be obtained.
 To overcome this bias, two-stage sampling has to be used.
 A second sample is selected at random from the non-respondents, who are then contacted
personally.
 In this way the consistency of the data obtained from the first sample can also be verified.
 Similarly, if a researcher goes for a national survey of counsellors, he/she can draw a
sample of five states representing the northern, eastern, southern, western and central
regions. From these five states, all the districts can be listed, out of which a sample of
30 to 40 districts can be drawn randomly.
 Within these districts, all the study centers can be enumerated.
 A random sample of about 300 to 400 study centers is then drawn.
 Further, a random sample of about 1500-2000 counsellors is drawn for the survey.
 The successive random sampling of states, districts, study centers and finally counsellors
thus provides a multi-stage sample.
 Multi-stage sampling is advantageous as the burden on the respondents is lessened, and it is
cost effective, time saving and efficient in formulating the sub-sample data.
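
The successive sampling of states, districts, study centers and counsellors can be sketched with a nested frame. Everything here is hypothetical: the frame is invented and far smaller than a national survey, the function name multistage_sample is an assumption, and the sketch samples within each selected parent unit, which is a simplification of pooling all districts across the chosen states.

```python
import random

def multistage_sample(node, sizes, rng):
    """Sample sizes[0] units at this stage, then recurse into each selected unit."""
    n = min(sizes[0], len(node))
    if isinstance(node, dict):                       # an intermediate stage
        result = []
        for key in rng.sample(sorted(node), n):
            result.extend(multistage_sample(node[key], sizes[1:], rng))
        return result
    return rng.sample(node, n)                        # final stage: the counsellors

# Hypothetical frame: 5 states x 4 districts x 3 study centres x 10 counsellors.
frame = {
    f"s{s}": {
        f"s{s}-d{d}": {
            f"s{s}-d{d}-c{c}": [f"s{s}-d{d}-c{c}-counsellor{k}" for k in range(10)]
            for c in range(3)
        }
        for d in range(4)
    }
    for s in range(5)
}

rng = random.Random(42)   # arbitrary seed, used only for repeatability
# 2 states, 2 districts per state, 2 centres per district, 3 counsellors per centre
counsellors = multistage_sample(frame, sizes=[2, 2, 2, 3], rng=rng)
print(len(counsellors))
```

At each stage only the units inside the previously selected units need to be listed, which is what keeps the method economical.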

 However, this method is recommended only when it seems impractical to draw a simple
random sample.


 When the units vary in size, it is better to select a sample in such a way that the
probability of selection of each unit is proportional to its size.
 For example, a particular study center has a population of 200 learners and another one
has 100.
 While drawing a sample, the first study center will have double the chance of selection as
compared to the second study center. Such a sample is known as a probability proportional
to size sample or PPS sample.
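
The idea of selection with probability proportional to size can be sketched with weighted random draws. The centre names and the function name pps_sample are assumptions, and the sketch draws with replacement, which is a simplification; practical PPS designs often use other selection schemes.

```python
import random

def pps_sample(sizes, draws, seed=None):
    """Draw units with replacement, with probability proportional to their size."""
    rng = random.Random(seed)
    names = list(sizes)
    weights = [sizes[name] for name in names]
    return rng.choices(names, weights=weights, k=draws)

# Two hypothetical study centres with 200 and 100 learners.
centres = {"centre_A": 200, "centre_B": 100}
draws = pps_sample(centres, draws=30_000, seed=11)
ratio = draws.count("centre_A") / draws.count("centre_B")
print(round(ratio, 2))
```

Over many draws the larger centre is selected about twice as often as the smaller one, in line with the example in the text.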

Using Computer for Sample Selection

 There are a number of websites that will generate random numbers for you.
 For example, the website www.randomizer.org is very easy to use.
 On opening this website you will have to answer a series of questions, such as how many
sets of random numbers are to be generated, how many numbers per set are to be produced,
the number range, etc.
 Many software packages include programmes for selecting a random sample.
 One such package is Statistical Package for Social Sciences (SPSS) for Windows 15.0
(SPSS, Inc., 2006). SPSS has two options for specifying the size of a random sample:
a) Exactly
b) Approximately
 ‘Exactly’, as the name suggests, requires an exact number, e.g., 600 out of the 2000 Class
IX students listed.
 The second option, ‘Approximately’, specifies the sampling fraction, i.e., the ratio of
sample size to population size; e.g., 30 percent of all the Class IX students could be selected.
 A number of other software packages are also available that provide the scope for the
selection of a random sample other than a simple random sample.
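
The difference between an exact sample size and an approximate sampling fraction can be illustrated in a general-purpose language. This sketch does not reproduce how SPSS works internally: the function names are hypothetical, and selecting each unit independently with the given probability is an assumed interpretation of the approximate option.

```python
import random

def exact_sample(units, n, rng):
    """Exactly n units, drawn at random without replacement."""
    return rng.sample(units, n)

def approximate_sample(units, fraction, rng):
    """Each unit is kept independently with the given probability,
    so the sample size is only approximately fraction * len(units)."""
    return [u for u in units if rng.random() < fraction]

rng = random.Random(2006)              # arbitrary seed for repeatability
students = list(range(1, 2001))        # 2000 listed Class IX students
exact = exact_sample(students, 600, rng)
approx = approximate_sample(students, 0.30, rng)
print(len(exact), len(approx))
```

The first call always returns 600 students, while the second returns roughly 30 percent of the 2000 listed students, with a size that varies from run to run.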

SUMMARY

This briefing on probability sampling covers: (1) the concept of sampling, (2) the concept of
population, (3) methods of sampling, with emphasis on probability sampling, and (4) a research
article for evidence.

CONCLUSION

Probability sampling tends to lead to higher-quality findings because it gives an unbiased
representation of the population. When the population is diverse, researchers use this method
extensively as it helps them create a sample that represents the population. A probability
sample ensures that each element within the population of interest has a known chance of
being chosen.

RESEARCH ARTICLE

Debashis Rout (2019) conducted a study on data mining with a focus on sampling. Data mining is
a key subject for discovering the various dimensions of unpolished data, which is crucial for
extracting the data that will be used in data analysis. One major area of data mining is data
pre-processing. It is rarely possible to find exactly the data required when planning a data
analysis that uses raw data for a large population, and it is wrong to believe that raw data
from different sources can be used directly. There are many reasons for this: the data may be
incomplete, noisy, duplicated or inconsistent. This has a major impact on any important
decision based on the data analysis. Data pre-processing therefore plays a major role in
business intelligence, validating raw data in order to prepare quality data. Data
pre-processing is a vast area consisting of various interrelated strategies and methods.
The article discusses one of the important pre-processing techniques, namely sampling, and
compares various sampling procedures.

REFERENCES

Fisher, R. A., and Yates, F. (1967) Statistical Tables for Biological, Agricultural and Medical
Research, London, Oliver and Boyd.
Krejcie, Robert V., and Morgan, Daryle W. (1970) Determining Sample Size for Research
Activities, Educational and Psychological Measurement, 30, 607-610.
Enki-Village. (2019). What Is Purposive Sample? When and How to Use It. [online] Available at:
https://www.enkivillage.org/purposive-sampling.html [Accessed 19 Apr. 2019].
Abedor, Handbook on Improving Sampling Methods.
Knuth, Donald E., TEX, a System for Technical Text, American Mathematical Society.
