Chapter 2: Sampling Theory

Statistics, Economics

Uploaded by

bantaleme
Copyright
© © All Rights Reserved
Chapter Two

Sampling Theory

Sampling Theory: the study of the relationships between a population and the samples drawn from it.

Sampling: the process or method of selecting a sample from the target population.

Population: an aggregate of items that share a common characteristic.
• A population may be finite or infinite.

Finite Population: has a definite, countable number of items.

Infinite Population: has an unlimited number of items, so the items cannot be counted in full.

• A population may also be hypothetical or existent.

Hypothetical Population: a population that does not exist physically; the items that constitute it are only imagined.
• E.g., all possible outcomes of tossing a coin or throwing a die form a hypothetical population.

Existent Population: a population of concrete objects, i.e. one whose items really exist.
• E.g., the students who scored grade ‘A’ in Introduction to Economics last year and joined the Economics department.

• Sampling theory relies on several related concepts, and it applies only to random samples.

Population: the group of all elements or observations (persons, animals, objects, measurements, etc.) under consideration in a given problem.
• E.g., all students in the University, all households in Addis Ababa city, all light bulbs produced by a firm in a single day, all fish in a lake, etc.

Census: the collection of data from the whole population, i.e. complete enumeration of its elements.
• It is the actual measurement or observation of every element in the population, i.e. a survey of everyone in the population.

Target Population: the population of interest, to which researchers would like to generalize the results of the study.
• E.g., in a study of the effect of a new fertilizer on crop yield, the target (reference) population is all farmers who use the new fertilizer.

Sample: a portion (subset) of the population, taken in order to make generalizations about the population.
• A sample should be representative of the population.

Sampling: the process of selecting a finite number of elements from the population of interest for the purposes of an inquiry or study.
• In industry, the quality of a product is assessed through sampling.
• Public opinion on social, economic and political problems is ascertained through sampling.
• Sampling should ensure that the sample accurately represents the population of interest for the purpose of the inquiry.

Parameter: any measurable characteristic of a population.
• E.g., population mean, population standard deviation, population median, etc.

Statistic: a number computed from sample data, i.e. any measurable characteristic of a sample.
• E.g., sample mean, sample standard deviation, sample median, etc.
• A statistic estimates the corresponding population parameter, such as the population mean (µ) or the population standard deviation (σ).
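
The difference between a parameter and a statistic can be sketched in a few lines of Python. The ten scores and the five-score sample below are made-up numbers used only for illustration; they are not data from this chapter.

```python
from statistics import mean, pstdev, stdev

# Hypothetical population of 10 exam scores (made-up data).
population = [52, 61, 67, 70, 73, 75, 78, 82, 88, 94]

# Parameters: computed from every element of the population.
mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation

# Statistics: computed from a sample (a fixed subset of 5 scores, for illustration).
sample = [61, 70, 75, 82, 94]
x_bar = mean(sample)         # sample mean, an estimate of mu
s = stdev(sample)            # sample standard deviation, an estimate of sigma

print(f"mu = {mu}, x_bar = {x_bar}")
print(f"sigma = {sigma:.2f}, s = {s:.2f}")
```

The sample values differ somewhat from the population values; that gap is the sampling error discussed later in the chapter.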

Sampling Unit: the ultimate unit to be sampled, i.e. an element of the population that may be selected.
• In a study of “the socio-economic status of households”, the household is the sampling unit.
• In a study of “the performance of freshman students in some college”, the student is the sampling unit.

Sample Size: the number of elements or observations to be included in the sample.

Sampling Frame: the list of all elements in the population.
• A list of households, when conducting an inquiry on households.
• The list of students at the registrar's office, when conducting an inquiry on students.

Errors in Sample Survey
• There are two types of errors in a sample survey.

Sampling Error: the difference between the population value (parameter) and the sample value (statistic).
• It arises because only part of the population is observed, and it may be made worse by an inappropriate sampling technique.

Non-sampling Errors: errors that arise from bias in the survey procedure, such as:
  o Incorrect responses
  o Measurement errors
  o Errors at different stages of data processing.


Reasons for Sampling
• There are several reasons for taking a sample and drawing conclusions about the population from it.

Cost constraint: a sample survey is less costly than a census.

Time constraint: sampling is much faster than taking a census.

Improved accuracy: a sample may give more accurate results and involve fewer human errors than a census.
• Data gathered from fewer sources tend to be collected and tabulated more completely.

Vast scope of investigation/infinite population: sampling makes the investigation feasible when its scope is vast or the population is infinite.

Destructive testing: when testing destroys the item, sampling avoids destroying the whole population.

• Sometimes taking a census makes more sense than using a sample.
• Some of the reasons are:
  o Universality,
  o Qualitativeness,
  o Detailedness, and
  o Non-representativeness

Sampling Techniques/Methods
• When the population is too large, usually only a representative sub-group of the population (a sample) is included in the investigation.
• Sampling involves selecting study units from a defined population in a way that ensures the sample accurately represents the population under investigation.
• There are two types of sampling techniques:
  o Probability (random) sampling
  o Non-probability (non-random) sampling

Probability Sampling: a method of sampling in which every element in the population has a pre-assigned, non-zero probability of being included in the sample.
• In probability sampling, every unit of the population has a known chance of being selected.
• The basic types of probability sampling methods are:
  o Simple random sampling
  o Stratified random sampling
  o Cluster sampling
  o Systematic (quasi-random) sampling
  o Multistage sampling

Simple Random Sampling: a method of selecting items from the population such that every possible sample of a given size has an equal chance of being selected.
• All elements in the population have the same pre-assigned non-zero probability of being included in the sample.
• Simple random sampling may be done:
  o With replacement (an observation may be selected more than once), or
  o Without replacement (an observation that has been selected cannot be selected again).
• Simple random sampling can be carried out using the lottery method or a table of random numbers.
• The procedure for selecting a simple random sample is:
  o Make a numbered list of all units in the population (the sampling frame).
  o Number the units on the list in sequence from 1 to N (the size of the population).
  o Select the required number of study units (the sample size) using the lottery method or a table of random numbers.
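
The selection procedure can be sketched with Python's standard `random` module standing in for the lottery or the random number table. The function name `simple_random_sample`, the frame of 20 units and the seed are assumptions made for this illustration only.

```python
import random

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n units from a numbered sampling frame.

    replace=False: each unit can be selected at most once.
    replace=True:  a unit may be selected more than once.
    """
    rng = random.Random(seed)          # seed only makes the example repeatable
    if replace:
        return [rng.choice(frame) for _ in range(n)]
    return rng.sample(frame, n)

N = 20
frame = list(range(1, N + 1))          # units numbered 1..N (the sampling frame)

without = simple_random_sample(frame, 5, seed=1)
with_repl = simple_random_sample(frame, 5, replace=True, seed=1)
print("without replacement:", without)
print("with replacement:   ", with_repl)
```

Without replacement the five selected numbers are always distinct; with replacement the same number may appear twice.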

• The procedure for selecting a sample using the lottery method is:
  o Number a set of tickets so that each ticket represents one unit in the population.
  o Mix the tickets thoroughly and draw as many tickets as the sample size (for a sample of 200 students, the researcher would draw 200 tickets).
  o The sample then consists of the units of the population that correspond to the drawn tickets.

Table of Random Numbers: a table of the digits 0, 1, 2, …, 9 in which each digit has an equal chance of appearing at any position.
• Such a list of numbers can also be generated by computer (software such as SPSS, Stata, etc.).
• The procedure for selecting a sample with a random number table is:
  o The researcher assigns a number to each unit of the population and constructs or obtains the random number table.
  o The investigator randomly selects a starting point, moves across the rows or down the columns, and lists the numbers as they appear in the table.
  o The members of the population whose numbers were selected constitute the sample.
• It is possible for a unit's number to be selected more than once.
• For instance, suppose an investigator has been asked to perform a survey in a prison and has been given the list of all 2000 prisoners. The investigator thinks that a sample of 300 would be satisfactory. To choose 300 prisoners for interview at random, the investigator can use a random number generator to generate 300 numbers between 1 and 2000.
  o Most of the time, however, some numbers will be repeated, and the repeats should be replaced by new numbers.
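
The prison example can be sketched in Python. This version assumes the prisoners are numbered 1 to 2000 and uses `random.sample`, which draws without replacement, so no repeated numbers need to be replaced afterwards. The seed value is arbitrary and only makes the run repeatable.

```python
import random

rng = random.Random(2024)              # arbitrary seed, for a repeatable example
prisoner_ids = range(1, 2001)          # prisoners numbered 1..2000 (assumed numbering)

# 300 distinct prisoner numbers, every prisoner equally likely to be chosen.
selected = sorted(rng.sample(prisoner_ids, 300))

print(len(selected), "prisoners selected; first ten:", selected[:10])
```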

Advantage of Simple Random Sampling
• It ensures that the sample is unbiased, because every individual and every possible sample has an equal chance of being chosen.

Dis-advantages of Simple Random Sampling
• Simple random sampling requires a sampling frame (a list of all elements in the population), which is sometimes impossible to obtain.
• It is difficult to take a sample when the reference population is widely scattered.
• In an extremely large population, numbering the units and selecting the sample is tedious and time consuming.
• Minority sub-groups of interest in the population may not be represented in the sample.
• When using a table of random numbers, the investigator has to ignore repeated numbers and numbers that lie above the population size.

Stratified Sampling
• It is applied to heterogeneous populations.
• The population is divided into non-overlapping but exhaustive groups, called strata, that differ from one another in character.
• A sample is chosen from each stratum by simple random or systematic sampling.
• Elements within the same stratum should be homogeneous, while elements in different strata should differ.
• The strata are formed according to characteristics such as gender, race, region, age, or religion.
• Stratified sampling can be:
  o Proportionate (the number of elements drawn from each stratum is proportional to its size), or
  o Non-proportionate (an equal number of elements is drawn from each stratum).

Proportionate Stratified Sampling: sampling in which the number of units selected from each stratum is directly proportional to the size of the stratum.
• If Pi is the proportion of the population that falls in stratum i and n is the total sample size, the number of elements selected from stratum i is ni = n*Pi.

Non-proportional Stratified Sampling: sampling in which the number of items studied in each stratum is disproportionate to the stratum's share of the population.
• In the non-proportional stratified sampling used here, an equal number of elements is selected from each stratum.
• Example: suppose that a sample of size 30 is to be drawn from a population of size 8000, which is divided into three strata of sizes 4000, 2400 and 1600.
i) Adopting proportional allocation, find the sample size for each stratum.
ii) Using non-proportional allocation, find the sample size for each stratum.

Solution
i) The sample sizes for the different strata are obtained as follows:
N1 = 4000, P1 = 4000/8000 = 0.5, hence n1 = n*P1 = 30*0.5 = 15
N2 = 2400, P2 = 2400/8000 = 0.3, hence n2 = n*P2 = 30*0.3 = 9
N3 = 1600, P3 = 1600/8000 = 0.2, hence n3 = n*P3 = 30*0.2 = 6
• Therefore, the sample sizes for the three strata are 15, 9 and 6 respectively, proportionate to the stratum sizes 4000, 2400 and 1600.

ii) For the non-proportional sample, there are three strata and 30 units to be selected in total, so each stratum receives an equal share: 30/3 = 10 units per stratum.
• Therefore, the investigator has to select 10 elements from each of the three strata, regardless of the number of elements in each stratum (10 each from 4000, 2400 and 1600).
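
The two allocations in this example can be checked with a short Python sketch. The function names are illustrative assumptions. The rounding step is a simplification: for other inputs the rounded sizes may not add up exactly to n and would need adjusting by hand.

```python
def proportional_allocation(strata_sizes, n):
    """n_i = n * P_i, where P_i = N_i / N (rounded to whole units)."""
    N = sum(strata_sizes)
    return [round(n * N_i / N) for N_i in strata_sizes]

def equal_allocation(strata_sizes, n):
    """Non-proportional allocation as used here: the same number from every stratum."""
    return [n // len(strata_sizes)] * len(strata_sizes)

strata = [4000, 2400, 1600]                  # stratum sizes from the example
print(proportional_allocation(strata, 30))   # [15, 9, 6]
print(equal_allocation(strata, 30))          # [10, 10, 10]
```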

Advantages of Stratified Sampling
• Unlike simple random sampling, it ensures that minority sub-groups of interest in the population are represented in the sample.

Dis-advantages of Stratified Sampling
• If there are many variables of interest, dividing a large population into representative sub-groups requires a great deal of effort.
• If the variables are somewhat complex or ambiguous (such as beliefs, attitudes, etc.), it is difficult to separate individuals into sub-groups according to the selected variables.

Cluster Sampling: the population is divided into non-overlapping groups (clusters), and a sample of clusters is selected.
• Ideally, the elements within a cluster are heterogeneous, while the clusters are similar to one another.
• It is useful when generating a simple random sample is impractical or costly.
  o E.g., to estimate the average annual household income in a large city, the investigator uses cluster sampling, because simple random sampling would need a complete list of the households in the city.
  o A sample of clusters could then be randomly selected, and every household within the selected clusters could be interviewed to find the average income.
• The procedure used in cluster sampling is:
  o Divide the reference population into clusters (sub-groups), preferably of similar size.
  o Take a random or systematic sample of the clusters.
  o Either study all the units in the selected clusters, or select a sample of units from each selected cluster (in the latter case the investigator first selects a sample of clusters and then selects elements within each of them).
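
The cluster sampling procedure can be sketched in Python as follows. The city blocks, the household labels and the function name `cluster_sample` are hypothetical and serve only to illustrate the steps.

```python
import random

def cluster_sample(clusters, n_clusters, per_cluster=None, seed=None):
    """Randomly pick n_clusters clusters, then take all units in each
    (per_cluster=None) or a random sub-sample of per_cluster units."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # step 2: sample the clusters
    sample = {}
    for name in chosen:                                 # step 3: units within clusters
        units = clusters[name]
        if per_cluster is None:
            sample[name] = list(units)
        else:
            sample[name] = rng.sample(units, min(per_cluster, len(units)))
    return sample

# Step 1: hypothetical city divided into 6 blocks of 8 households each.
clusters = {f"block_{b}": [f"hh_{b}_{i}" for i in range(1, 9)] for b in "ABCDEF"}

all_units = cluster_sample(clusters, 2, seed=7)                  # every household in 2 blocks
sub_sample = cluster_sample(clusters, 2, per_cluster=3, seed=7)  # 3 households per block
print(all_units)
print(sub_sample)
```

Only the list of blocks is needed to draw the sample; no complete list of households in the city is required.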

• Cluster sampling has the following advantages:
  o A list of all individual study units in the reference population is not required, which may reduce cost.
  o It simplifies field work and is convenient.
• Cluster sampling has the following dis-advantages:
  o In practice, the members of a cluster are often more homogeneous than the members of the whole population, so the sample may not be representative.
  o The elements in a cluster may not show the same variation in characteristics as elements selected from the whole population.


Systematic (Quasi-random) Sampling: it requires a complete list of all elements within the population (a sampling frame).
• In systematic sampling, the items of the population are arranged in some order, a random starting point is selected from 1 through k, and the elements to be included in the sample are picked at a constant interval (the sampling interval, denoted by k).
• The sampling interval is $k = N/n$, and every k-th member of the population after the starting point is selected for the sample, where N is the population size and n is the sample size.

• For a random start j between 1 and k, i.e. 1 ≤ j ≤ k, unit j is selected first, and then units j + k, j + 2k, … etc. are selected until the required sample size is reached.
• Suppose there are 2000 subjects in the population and a sample of 50 subjects is needed.
  o Then the sampling interval is k = N/n = 2000/50 = 40.
  o The number of the first subject to be included in the sample is chosen randomly by blindly picking one out of 40 pieces of paper numbered 1 to 40 (lottery method).
  o Let 12 be the first subject selected. The sampling interval k is then added repeatedly, and the sample consists of the subjects numbered 12, 52, 92, 132, … etc., following the pattern j, j + k, j + 2k, …, until 50 subjects are obtained.
• In systematic sampling, only the starting point is chosen at random, so not every possible sample has an equal chance of being selected; it is not strictly random.
• Suppose a researcher wants to know the impact of micro-finance on the clients' household income. The researcher wishes to select 10 clients out of 250 clients using systematic sampling.
  o How should the research assistant select a systematic sample of 10 clients?
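
A minimal Python sketch of systematic selection, covering both the 2000-subject example and the 250-client question. It assumes the frame is numbered 1 to N and that n divides N exactly; the function name and the seed are illustrative choices.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return (k, j, selected unit numbers) for a frame numbered 1..N."""
    k = N // n                                   # sampling interval k = N/n
    if start is None:
        start = random.Random(seed).randint(1, k)   # random start j, 1 <= j <= k
    return k, start, [start + i * k for i in range(n)]

# 2000 subjects, sample of 50, first subject drawn by lottery is 12.
k, j, ids = systematic_sample(2000, 50, start=12)
print(k, ids[:4])                                # 40 [12, 52, 92, 132]

# 250 clients, sample of 10: k = 250/10 = 25, random start between 1 and 25.
k2, j2, ids2 = systematic_sample(250, 10, seed=3)
print(k2, j2, ids2)
```

For the micro-finance question, the research assistant would number the clients 1 to 250, pick a random start between 1 and 25, and then take every 25th client after it.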
• Systematic sampling has the following advantages:
  o It is less time consuming and easier to perform than simple random sampling (SRS).
  o It is more convenient to use than SRS.
  o It provides a good approximation to simple random sampling.
• Systematic sampling has the following dis-advantages:
  o If there is any sort of cyclic ordering of the subjects, the sample will not be representative of the population.
  o Suppose the subjects in the population are arranged in the order: defective item, non-defective item, defective item, non-defective item, … etc.
  o Depending on whether the sampling interval k is even or odd, the choice of starting point could produce a sample of all defective items or all non-defective items.
    o If the starting point is a defective item and k is even, the sample will consist entirely of defective items.
    o If the starting point is a non-defective item and k is even, the sample will consist entirely of non-defective items.
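
The effect of cyclic ordering can be demonstrated with a small Python sketch. The population of 40 alternating items and the intervals 4 and 5 are assumptions chosen for illustration.

```python
# Items at odd positions are defective, items at even positions are not.
population = ["defective" if i % 2 == 1 else "non-defective" for i in range(1, 41)]

def systematic_pick(items, start, k):
    """Take the item at position `start` (1-based) and every k-th item after it."""
    return items[start - 1::k]

even_k = systematic_pick(population, start=1, k=4)   # even interval, defective start
odd_k = systematic_pick(population, start=1, k=5)    # odd interval, same start

print(set(even_k))   # only defective items are ever reached
print(set(odd_k))    # both kinds of item appear
```

With an even interval the sample never leaves the parity of its starting position, which is why the result is all defective or all non-defective.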

Multi-stage Sampling: it is used for a large and widely scattered target population.
• In multi-stage sampling, the investigator carries out several sampling steps (primary sampling unit, secondary sampling unit, third sampling unit, … etc.) and may use a different suitable sampling method at each step.
  o In a study at district level, the primary sampling units can be the districts, the secondary sampling units the kebeles, the third sampling units the villages, etc.
• It may use both probability and non-probability sampling techniques, but the sample that represents the population has to be selected using probability sampling.
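
A three-stage selection (districts, then kebeles, then villages) can be sketched in Python. The frame below is entirely hypothetical, and simple random sampling is assumed at every stage; a real study may use a different method at each stage.

```python
import random

def multistage_sample(frame, n_districts, n_kebeles, n_villages, seed=None):
    """Stage 1: districts; stage 2: kebeles within them; stage 3: villages within those."""
    rng = random.Random(seed)
    result = []
    for district in rng.sample(sorted(frame), n_districts):
        kebeles = frame[district]
        for kebele in rng.sample(sorted(kebeles), n_kebeles):
            for village in rng.sample(kebeles[kebele], n_villages):
                result.append((district, kebele, village))
    return result

# Hypothetical frame: 5 districts, 3 kebeles per district, 4 villages per kebele.
frame = {
    f"district_{d}": {
        f"kebele_{d}{k}": [f"village_{d}{k}{v}" for v in range(1, 5)]
        for k in range(1, 4)
    }
    for d in range(1, 6)
}

selected = multistage_sample(frame, n_districts=2, n_kebeles=2, n_villages=2, seed=11)
for row in selected:
    print(row)
```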
Non-probability Sampling
• It is a sampling technique in which the choice of units for the sample depends on convenience and personal interest.
• In non-random sampling, not every element of the population has a known chance of being included in the sample.
• The process of sample selection involves at least some degree of personal subjectivity.
• The units included in the sample are selected at the discretion of the researcher, so the sample derives from the researcher's judgment.
• A sampling frame (list of all elements in the population) is not necessary.
• The sample is selected non-randomly and may be unrepresentative of the population.
• It is inappropriate if the investigation aims to measure variables and generalize the findings to the population on the basis of the sample.
• It is easier, quicker and cheaper to carry out than probability sampling designs.

Dis-advantages of Non-probability Sampling Technique
• Non-probability sampling has the following dis-advantages:
  o It depends exclusively on uncontrolled factors and the researcher's insight, and there is no statistical method for determining the margin of sampling error.
  o Such samples are sometimes based on an obsolete frame, which does not adequately cover the population.
  o The results obtained may not be generalizable to the entire population.

Advantages of Non-probability Sampling Technique
• Non-probability sampling has the following advantages:
  o It is much less complicated and less expensive.
  o It is very convenient.
  o It is adequate for vast investigations.
  o It is adequate when there is no need to generalize the findings beyond the sample.

• There are four non-probability sampling methods.

Purposive (Judgmental) Sampling: sampling in which the researcher has direct or indirect control over which items are selected for the sample, on the basis of certain criteria.
• The researcher uses subjective judgment to draw the samples that are most informative for the study being undertaken.

Convenience Sampling: selecting a sample from the population in a manner that is relatively easy and convenient.
• The primary concern of convenience sampling is ease of access to the units chosen.
• Convenience sampling is:
  o Convenient
  o Ineffective
  o Highly unrepresentative
  o Prone to high bias and systematic errors
  o The least reliable method, but cheap and easy to carry out
  o Without any control to ensure precision.
• Selecting respondents and conducting interviews in convenient locations, such as a lounge, road, market place, etc., is convenience sampling.

Quota Sampling: a sampling method that ensures that a certain number of sample units is drawn from each of several categories with specific characteristics.
• Quota sampling can be applied for affirmative action.

Snowball Sampling: a method of sampling special populations by first identifying one or a few of their members.
• The researcher locates an initial set of respondents, who are then used as informants to identify others with the desired characteristics.

Problems in Sampling
• Problems in sampling may be non-sampling errors or sampling errors.

Non-sampling Errors: the different non-sampling errors that occur in a sample survey are:

Non-coverage Error: it refers to defects in the sampling frame.
• Omission of part of the population; e.g. soldiers, students and people in hospital are typically excluded from national surveys.
• Omission of people who do not have a telephone in telephone surveys.

Sampling the Wrong Population: the sample may not have been drawn from the target population.

Non-response Error: it occurs when selected people refuse to be interviewed, because they are too busy, do not trust the interviewer, or are not interested in the interview.

Instrumental Errors: errors in the instrument used to collect the data.
• Carelessly worded questions in the questionnaire may lead to misinterpretation and wrong responses.

Interviewer Errors: some characteristics of the interviewer (age, gender, race, etc.) may affect the way in which the respondent answers questions.
• Questions about racial discrimination might be answered differently depending on the racial group of the interviewer.

Ways to Reduce/Correct Non-sampling Error
• Some ways of reducing or correcting non-sampling errors are:
  o Ensure that survey instruments are well prepared, simple to read and easy to understand.
  o Select and train interviewers properly to control bias and error in data gathering.
  o Use sound editing, coding and tabulating procedures to reduce the possibility of data processing errors.

Sampling Error
• Sampling error is the discrepancy between the population value (parameter) and the sample value (statistic). It may occur due to:
  o Random variation in the sample estimate of the true population parameter.
  o An inappropriate sampling technique.
• Sampling error can be reduced by increasing the size of the sample.
  o If n = N, the sampling error becomes zero.
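
The link between sample size and sampling error can be illustrated with a small simulation. The population of 500 made-up values, the seed and the number of repeats are assumptions; the sketch averages the error over many samples because the error of a single sample is itself random.

```python
import random
from statistics import mean

rng = random.Random(0)                       # fixed seed so the run is repeatable
population = [rng.randint(100, 1000) for _ in range(500)]   # made-up values, N = 500
mu = mean(population)                        # population mean (parameter)

def average_sampling_error(n, repeats=200):
    """Mean absolute gap between the sample mean and mu over repeated samples of size n."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)   # simple random sample without replacement
        errors.append(abs(mean(sample) - mu))
    return mean(errors)

for n in (10, 50, 250, 500):
    print(n, round(average_sampling_error(n), 2))
```

The average error shrinks as n grows, and at n = N = 500 every sample is the whole population, so the error is exactly zero.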
Required Sample Size
• The sample size should be a function of the variation in the population characteristic under investigation and of the precision of estimation needed by the researcher.
• Some principles that influence sample size are:
  o The greater the variance within the population, the larger the sample must be to provide a given precision of estimation.
  o The greater the desired precision of the estimate, the larger the sample must be.
  o The narrower the desired interval, the larger the sample must be.
  o The higher the desired confidence level in the estimate, the larger the sample must be.
  o The greater the number of sub-groups of interest within a sample, the larger the sample must be, as each sub-group must meet the minimum sample size requirement.

Sample Size Estimation using Yamane and Cochran and Krejcie and Morgan and Green Formulas
1. Taro Yamane (1973), if the population size is known:

$n = \dfrac{N}{1 + N e^{2}}$

Where n = sample size, N = population size, and e = error, i.e. the level of precision, set here to 0.05 (reliability level 95%).
Example: if the population size is N = 37581, find the sample size n.
Solution
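
One way to work this out is sketched below in Python, assuming the commonly cited form of Yamane's formula, n = N / (1 + N·e²), with e = 0.05. The function name is an illustrative choice, and rounding up to the next whole unit is a convention assumed here.

```python
import math

def yamane_sample_size(N, e=0.05):
    """Yamane's formula for a known population size N and precision e."""
    return N / (1 + N * e ** 2)

n = yamane_sample_size(37581)
# 1 + 37581 * 0.0025 = 94.9525, so n = 37581 / 94.9525
print(round(n, 2), "->", math.ceil(n))   # about 395.79, rounded up to 396
```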

2. Krejcie & Morgan Formula (Krejcie & Morgan, 1970)
• If the population size is known, the sample size n is:

$n = \dfrac{\chi^{2} N p (1 - p)}{e^{2}(N - 1) + \chi^{2} p (1 - p)}$

Where n = sample size, N = population size, e = acceptable sampling error (0.05), $\chi^{2}$ = Chi-square value for df = 1 at reliability level 95% ($\chi^{2}$ = 3.841), and p = the population proportion (assumed to be 0.5).
Example: if the population size is 37581, determine the sample size using the Krejcie & Morgan formula.
Solution
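
A Python sketch of the calculation, assuming the usual statement of the Krejcie & Morgan formula, n = χ²·N·p(1 − p) / (e²(N − 1) + χ²·p(1 − p)), with χ² = 3.841, p = 0.5 and e = 0.05. The function name is illustrative.

```python
def krejcie_morgan_sample_size(N, e=0.05, chi2=3.841, p=0.5):
    """Krejcie & Morgan sample size for a known population size N."""
    numerator = chi2 * N * p * (1 - p)
    denominator = e ** 2 * (N - 1) + chi2 * p * (1 - p)
    return numerator / denominator

n = krejcie_morgan_sample_size(37581)
# numerator   = 3.841 * 37581 * 0.25      = 36087.155...
# denominator = 0.0025 * 37580 + 0.96025  = 94.91025
print(round(n, 1))   # a little over 380
```

The result is slightly above 380, so the required sample is about 380 (381 if the value is always rounded up).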

3. Cochran Formula (Cochran, 1977)
3.1 If the population size is unknown but large, and the population proportion is known:

$n = \dfrac{z^{2} p (1 - p)}{e^{2}}$

Where n = sample size, p = the population proportion (p = 0.1), e = acceptable sampling error (e = 0.05), and z = z value at the reliability level or significance level.
Example: given z = z value at significance level 0.01, z = 2.58, p = 0.1 and e = 0.05, calculate the sample size n.

3.2 If the population size is unknown and the population proportion is unknown:

$n = \dfrac{z^{2}}{4 e^{2}}$

Example: from the above example, find the sample size assuming the population proportion p is unknown.
Solution:
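
Both Cochran cases can be sketched with one Python function, assuming the standard form n = z²·p(1 − p) / e². When p is unknown, the sketch takes p = 0.5, which gives the largest sample and reduces the formula to z² / (4e²). The function name and the round-up convention are assumptions.

```python
import math

def cochran_sample_size(z, e, p=0.5):
    """Cochran's formula for a large or unknown population size.

    p defaults to 0.5, the most conservative choice when p is unknown.
    """
    return z ** 2 * p * (1 - p) / e ** 2

n_known = cochran_sample_size(z=2.58, e=0.05, p=0.1)   # 6.6564 * 0.09 / 0.0025
n_unknown = cochran_sample_size(z=2.58, e=0.05)        # 6.6564 / (4 * 0.0025)

print(round(n_known, 2), "->", math.ceil(n_known))      # about 239.63, rounded up to 240
print(round(n_unknown, 2), "->", math.ceil(n_unknown))  # about 665.64, rounded up to 666
```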

