Sampling

Sampling is the process of selecting a smaller group from a larger population to make statistical inferences about the whole. It is crucial for research due to cost-effectiveness, accessibility, and the ability to generate representative data. Various sampling techniques exist, including probability and non-probability methods, each with its own advantages and disadvantages.

SAMPLING

Sample: Definition
 A smaller group/subset of elements (people, animals, or items) that
carries the characteristics of (i.e. is representative of) a larger
population and is used in statistical analysis to generalize facts about
that population
 When the population is too large to include every member, the sample
serves as a more manageable version of it
 It should represent the population without bias toward any specific
attribute.
 A faster and cheaper tool for studying and making inferences about the
entire population
Importance
 Cost in terms of money, time, and manpower
 Faster and cheaper than asking the entire population
 Accessibility
 Research is easier to conduct because the measurements are easily collected.
 Utility/usage
 Example: during a laboratory test, all of the patient's blood cannot be drawn; a sample
gives enough information about the rest of the blood.
Characteristics of Good Samples
Representation
 Studies are NOT conducted to examine and describe the sample, but rather to
understand the larger population.
 Considerable effort has been devoted to developing sampling techniques that
generate representative samples.
 Factors that determine sufficient representation:
 Sampling procedure
 Sample size
 Participation response (high vs. low)
Characteristics of Good Samples
Representation
 Two keys
 Selecting the right people
 They have to be selected scientifically so that they are representative of the population
 Selecting the right number of the right people
 To minimize sampling error, i.e. choosing the wrong people by chance
Sampling frame
 The list of identifying information (name, phone number, address, ID number) for the
population elements (humans, animals, organizations, items) that may be
included in the target sample and the study sample.
 It is a set of source elements, with particular descriptions/criteria/characteristics,
from which the sample is selected.
 Usually, the list does not include the entire population.
 e.g. a list of names of MS patients in KAH vs. in Jordan
Sampling bias
Having an unequal (higher/lower) probability of an element being selected into a
study sample, due mainly to lack of randomization.
 Reasons:
 Non-random sampling
 Small sample size
 Self-selection: individuals with specific characteristics (e.g. those with positive vs.
negative experiences) select themselves into/out of the research sample
 Under-/overcoverage of part(s) of the study population during sampling
 Patient non-response: the inability/refusal of part of the population to participate
 Survival bias: being selected/not selected due to visibility (e.g. health reasons).
 Advertising/pre-screening bias: the selection criteria or the advertising style or
language might encourage certain people more than others
Census vs. Study population
 A census is a sample consisting of the entire population.
 Highly representative, as it captures every small detail of the population.
 Disadvantages:
 Expensive
 Takes a long time
 Cumbersome and prone to data collection errors
 Circumstances for sampling the entire population
 Very small population
 Having extensive resources
 Expecting a very low response
 Study population:
 Includes the individuals who meet the study criteria and agree to participate
 Measurements are obtained from this portion of the population
Population vs. sample
 Population (universe): the entire set of elements (persons, animals, or objects)
 Target population: a group of elements with some common characteristics
 Meets a set of sampling criteria
 The population we wish to understand and make generalizations about
 e.g. men above 60 with type 1 diabetes, young women with hypertension, young men
with Parkinson’s
 Source/accessible population: the portion of the population to which the researcher
has reasonable access
 A subset of the target population
 May be limited to a region, country, city, county, or institution
 e.g. institutionalized young men with Parkinson’s
Population vs. sample
 Sample population: the elements of the source population that are selected
and invited to participate in a study
 A random selection of members of a population.
 A smaller group that shares the characteristics of the entire population.
 e.g. institutionalized young men with Parkinson’s in Irbid
 Study population: the portion that agreed to participate
 Met the inclusion and exclusion criteria, were invited, and agreed to participate
 Measurements are obtained from this portion of the population
 The inferences drawn from the observations and conclusions in the sample are attributed
to the whole population.
Hierarchy of sampling
 Study subjects: the actual participants in the study
 Study population: subjects who are selected
 Sample population: the list of potential subjects from which the sample is drawn
 Source population: the population from whom the study subjects would be obtained
 Target population: the population to whom the results would be applied
Sampling Techniques
Probability Sampling

 Randomization is used, so every member of the population has a known
(in the simplest designs, equal) probability of being included in the selected sample.
 Also known as random sampling.
 Usually used in quantitative research
 Well suited to generating a representative sample of the whole population
 Four types:
 Simple random sampling
 Systematic sampling
 Stratified sampling
 Cluster sampling
Simple randomization

 Every element of the population has an equal chance of being selected.


 Used when there is no prior knowledge about the target population
demographics.
 Usually the target population is “smaller”
 The sampling frame includes the whole population.
 Lottery method
Simple randomization
 Steps:
 Each element is assigned a number
 Numbers are then selected at random through an automated process
 The elements whose numbers are chosen are included in the sample.
 Tools such as random number generators are usually used:
 Computer-generated random number tables
 Drawing numbers from a box (hat)
 Bingo numbers
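The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `simple_random_sample`, the made-up frame of 100 patient labels, and the fixed seed are not part of the slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements from the sampling frame, each equally likely."""
    rng = random.Random(seed)      # a seed makes the draw reproducible
    return rng.sample(frame, n)    # sampling without replacement

# Hypothetical sampling frame that lists the whole (small) population
frame = [f"patient_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Sampling without replacement mirrors the lottery method: once a number is drawn it cannot be drawn again.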
Systematic sampling
 Members of the group are selected at regular intervals to form a sample.
 Gives every member of the population an equal opportunity to be selected
 Every “nth” individual is chosen to be part of the sample.
 e.g. every 5th person is selected to be in the sample.


Systematic sampling
Steps
 Define the population
 Determine the desired sample size (n)
 List the population from 1 to N
 Determine the sampling interval k, where k = N/n
 Select a random number between 1 and k; denote this number by “a”.
 Starting at a, take every kth element on the list until the desired sample size is
obtained.
 The selected positions will then be
 a, a+k, a+2k, a+3k, …, a+(n-1)k


Systematic sampling
Example
 N = 1200, and n = 60
 Sampling interval k = 1200/60 = 20 (the sampling fraction is n/N = 60/1200 = 1/20)
 List persons from 1 to 1200
 Randomly select a number between 1 and 20 (e.g. 8)
 1st person selected = the 8th on the list
 2nd person = 8 + 20 = the 28th on the list, etc.
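The same example can be reproduced with a short Python sketch. The function name `systematic_sample` and its parameters are illustrative assumptions; the sketch also assumes N is divisible by n (otherwise integer division truncates the interval).

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the 1-based list positions a, a+k, ..., a+(n-1)k with k = N // n."""
    k = N // n                                   # sampling interval
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start between 1 and k
    return [start + i * k for i in range(n)]

# Values from the example: N = 1200, n = 60, random start a = 8
positions = systematic_sample(1200, 60, start=8)
print(positions[:3], positions[-1])  # [8, 28, 48] 1188
```

Passing `start=8` fixes the random start so the output matches the example; leaving it out draws the start at random.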


Stratified sampling

 Stratifying/dividing a large population into smaller groups (strata) that do not
overlap yet together represent the entire population.
 The groups are usually formed according to sex, age, ethnicity, and similar characteristics.
 Subsequently, randomly select an equal number of participants from each
group, using simple random or systematic sampling.
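A minimal Python sketch of this procedure, assuming equal allocation per stratum as described above. The function name `stratified_sample`, the `stratum_of` callback, and the made-up frame of (id, sex) pairs are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, per_stratum, seed=None):
    """Split the frame into strata, then draw per_stratum elements at random from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)   # non-overlapping groups
    sample = []
    for label in sorted(strata):                      # fixed order for reproducibility
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample

# Hypothetical frame: 60 women and 30 men, stratified by sex
frame = [(i, "F" if i % 3 else "M") for i in range(1, 91)]
sample = stratified_sample(frame, stratum_of=lambda e: e[1], per_stratum=5, seed=1)
print(sample)
```

Each stratum contributes the same number of participants here even though the strata differ in size.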

Cluster sampling
 Appropriate for a large population that is spread out geographically
1. The population is divided into smaller subgroups (i.e. clusters), not
necessarily of the same size
2. The clusters are formed according to a characteristic (e.g. gender, location)
 The clusters are non-overlapping subpopulations; ideally each cluster is
internally heterogeneous (a small-scale picture of the population), while the clusters resemble one another
 Every element should belong to exactly one cluster, with nobody left out
Cluster sampling
3. Using simple random sampling, select some of the clusters to be included in
the study.
 The selected group should be representative of the whole population (e.g. all LBP patients in Jordan)
4. The researcher should have a prior idea of the number of participants
5. More cost-effective but statistically less efficient (less precise) than simple random or stratified
sampling.
Cluster sampling

 Example: collecting LBP patients’ opinions about RS in Jordan
1. Divide Jordan into clusters (e.g. regions (north, south, middle) or major cities)
2. Identify the hospitals within each region
3. Using a “simple random” or “systematic random” sampling technique, select the
hospitals that are going to be included in the study
 The participants in the selected clusters are combined into one group to be included
in the study
 This group should be representative of all LBP patients in Jordan
 The researcher should have a prior idea of the number of participants
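The example can be sketched in Python as follows. The hospital names and patient identifiers are made up for illustration, and `cluster_sample` is a hypothetical helper that picks whole clusters by simple random sampling and pools their members.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters whole clusters and pool all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)   # random draw of cluster names
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical clusters: hospitals of unequal size, each patient in exactly one
hospitals = {
    "hospital_A": ["A1", "A2", "A3"],
    "hospital_B": ["B1", "B2"],
    "hospital_C": ["C1", "C2", "C3", "C4"],
    "hospital_D": ["D1"],
}
chosen, patients = cluster_sample(hospitals, 2, seed=7)
print(chosen, patients)
```

Because whole clusters are taken, the final number of participants depends on which clusters are drawn, which is why a prior idea of the expected number is useful.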


Non-probability Sampling
 The samples are selected according to the subjective judgment of the
researcher rather than by random selection.
 Depends heavily on the researcher’s experience
 Less stringent.
 Often used for exploratory studies such as pilot studies and qualitative research.
 Not all members of the population have an equal opportunity to be selected
 The pool of subjects is predetermined (e.g. individuals who used a
particular lotion)
Non-probability Sampling
 Subject to bias
 Used when random (probability) sampling is difficult because of time/cost
constraints.
 Easier, faster, and less expensive
Convenience sampling
 The most common non-probability sampling method, especially for observing
habits, opinions, and viewpoints
 Used when the population is too large and the entire population is difficult to reach
 Possible sampling errors:
1. Lack of representation of the population
 The representativeness of the sample is not a priority
2. Potential bias


Convenience sampling
 Samples are selected because they are conveniently available to the
researcher.
 The participants are chosen purely on the basis of proximity/convenience
 Easy to recruit, speedy, and cost-effective
 No criteria are required to be part of this sample, so including elements in it is
very simple.
 People are eligible if they are available and willing to participate.
 All elements of the population are eligible; whether they end up in the sample
depends on their proximity to the researcher.
 Example: picking people at a mall to answer questions about a product.
Consecutive sampling

 Very similar to convenience sampling; however, the researcher picks a
person/sample, conducts research over a period, analyzes the results,
and then moves on to another person/sample if needed (e.g. to
verify/confirm/correct previous results or to reach conclusive results).
 In research, usually either the null or the alternative hypothesis is accepted.
In consecutive sampling, however, if neither is acceptable, the researcher
selects another pool of samples and conducts the experiment once again before
finally making a research decision.
Consecutive sampling
 Samples are recruited several times to collect different, yet related, information
about a particular phenomenon
 It lets the researcher answer many research questions (RQs) and fine-tune the research on various
topics with vital insights.
 Participants should meet preset inclusion/exclusion criteria and are then selected based on convenience
Consecutive sampling
 Advantages: easy to recruit, valuable, and cost-effective
 Disadvantages: time-consuming and potentially biased
 Example:
1. Picking people at a mall to answer questions about a product
 Provides initial information/opinions about the product
2. After these results are analyzed, a subsequent similar sample is recruited and
asked whether the participants would buy the product.
 Provides additional information
Snowball Sampling

 Samples are generated purely from referrals by existing subjects
 Also called chain-referral sampling
 Since it is a referral process, the sample size progressively increases (snowballing).
 Member(s) of the target population nominate and provide contact details for
other members of the same population
 Used when a population is unknown and rare, and thus hard to select and
assemble a sample from.
 Initial contact could be through organizations/support groups
 Subjects have rare traits/less-researched diseases.
 e.g. progeria, porphyria, Alice in Wonderland syndrome, etc.
 Patients can be contacted to “convince” them to participate
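The referral chain can be pictured with a small Python sketch. The referral map, the participant labels, and the function `snowball_sample` are invented for illustration; recruitment here simply follows referrals in the order they are received until a target size is reached.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Recruit the initial contacts, then whoever they refer, wave by wave."""
    sample = []
    queue = deque(seeds)
    seen = set(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for referred in referrals.get(person, []):
            if referred not in seen:          # nobody is recruited twice
                seen.add(referred)
                queue.append(referred)
    return sample

# Hypothetical referrals: each participant nominates other members of the population
referrals = {"p1": ["p2", "p3"], "p2": ["p4"], "p3": ["p4", "p5"], "p5": ["p6"]}
sample = snowball_sample(referrals, seeds=["p1"], max_size=5)
print(sample)  # ['p1', 'p2', 'p3', 'p4', 'p5']
```

Everyone in the sample traces back to the single initial contact, which is one way to see why the resulting sample can be “too homogeneous”.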
Snowball Sampling

Advantages
 Referrals make it easy, quick, convenient, and inexpensive to find subjects.
 Cost-effective, as the referrals are obtained from a primary data source.
 Easy to meet hesitant subjects in order to convince them
 Easier to reach hard-to-contact participants (e.g. people with sexually transmitted diseases)
Snowball Sampling
Disadvantages
 Sampling bias: the sample demographics might be “too homogeneous”
because individuals refer others with similar traits, resulting in an
unrepresentative sample
 Lack of cooperation: even after referrals, people might refuse to
participate.
 Invasion of privacy: contact information should be obtained only after the
referred person consents


Quota Sampling
The study sample size and the proportion in each age group are decided according
to the researcher’s experience
Quota Sampling
 The non-probability version of stratified sampling
 Obtain a set number (quota) of sample units from
different categories (e.g. age groups) with a specific characteristic (e.g. CHD),
without randomization.
 The sample represents some specific characteristics of the population.
 The population is first segmented into mutually exclusive subgroups (e.g. age
groups) according to a particular characteristic (e.g. CHD)
 After interviews, the participants are selected based on a specified proportion
and according to the investigator’s expertise and judgment (i.e. without
randomization) to fill the quota (the needed number meeting certain criteria).
 Bias and unrepresentative samples are the most common errors of the technique
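Filling a quota can be sketched in Python as below. This is a simplification under a loud assumption: the investigator's judgment is replaced by "accept candidates in the order they are encountered", and the function name `quota_sample`, the age groups, and the candidate list are all hypothetical.

```python
def quota_sample(candidates, group_of, quotas):
    """Accept candidates in the order encountered until each group's quota is full."""
    counts = {group: 0 for group in quotas}
    sample = []
    for candidate in candidates:
        group = group_of(candidate)
        if group in counts and counts[group] < quotas[group]:
            sample.append(candidate)      # no random draw is involved
            counts[group] += 1
    return sample

# Hypothetical interviewed candidates as (id, age group) pairs
candidates = [(1, "40-59"), (2, "60+"), (3, "40-59"),
              (4, "40-59"), (5, "60+"), (6, "60+")]
sample = quota_sample(candidates, group_of=lambda c: c[1],
                      quotas={"40-59": 2, "60+": 2})
print(sample)  # [(1, '40-59'), (2, '60+'), (3, '40-59'), (5, '60+')]
```

The quotas are met exactly, yet candidates 4 and 6 never had a chance of selection, which illustrates where the bias comes from.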
Judgment Sampling

 Similar to quota sampling, it depends on the researcher’s knowledge and
experience, and thus judgment.
 The study sample is directly “handpicked” from the target population without
clustering/quotas
 Also called purposive, judgmental, selective, authoritative, or subjective sampling
 Used when only a limited number of people with certain qualities or traits
are found in the target population
 Opinions of highly educated people or people knowledgeable about a particular topic
 e.g. professors’ opinions of the university services
 e.g. physicians’ views of the hospital privacy procedures
Sampling bias
 Consequences:
 Unrepresentativeness: results can be generalized only to similar populations
 Erroneous results: systematic over-/underestimation of a parameter/measure
 Misleading and inaccurate conclusions
 Avoiding:
 Use random rather than non-random sampling
 Use computer-based selection
 Predetermine a scientifically calculated, sufficient sample size to “dilute” potential
bias
 Examine the target population characteristics sufficiently and accurately
 Compile a complete and accurate directory of the study population
 Segments, ages, genders, distributions, etc.
Sample size
 Larger sample size:
1. Increases accuracy and representation
2. More power
3. Costly
4. Difficult to manage
 Smaller sample size:
1. Decreases accuracy and representation
2. Less power
3. Less expensive
4. Easier to manage
Sample size: Terms
 Sample size: the number of people sufficient for the study to give
accurate results (sufficient statistical power)
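 Confidence level (CL): how sure you can be that the interval built from the sample
contains the true population value. It is expressed as a percentage and paired with
the confidence interval (CI).
 e.g. CL = 90% → intervals built this way would capture the true value in about 90% of repeated samples.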
 The CL corresponds to a Z-score:
 90% – Z Score = 1.645
 95% – Z Score = 1.96
 99% – Z Score = 2.576
Sample size: Terms
 The margin of error (the half-width of the confidence interval): how far the sample
results may lie from the true population results, e.g. ±5%; usually reported together
with a 95% confidence level.
 Standard deviation: the dispersion of a data set around its mean; the higher the
dispersion, the greater the standard deviation.
Sample size
 Calculations
 Necessary sample size = (Z-score)² × StdDev × (1 − StdDev) / (margin of error)²
 In this formula "StdDev" plays the role of the expected proportion, so
StdDev × (1 − StdDev) is the variance of a proportion.
 Example: using a 90% CL (Z = 1.645), a StdDev of 0.6, and a margin of error
of ±5%.
 ((1.645)² × 0.6 × (1 − 0.6)) / (0.05)²
 (2.706 × 0.24) / 0.0025
 0.6494 / 0.0025
 ≈ 259.8
 Rounding up, 260 participants are needed, which is the sample size of the study
