RESEARCH DEVELOPMENT Lesson 6
BY
MR SHADRECK PEARSON
ORGANIZED BY
To draw valid conclusions from your results, you have to carefully decide how you will
select a sample that is representative of the group as a whole.
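There are two main types of sampling methods:
1. Probability sampling
Involves random selection, allowing you to make strong statistical inferences about the
whole group.
2. Non-probability sampling
Involves non-random selection based on convenience or other criteria, allowing you to
collect data easily.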
You should clearly explain how you selected your sample in the methodology section of your
paper or thesis.
POPULATION VS SAMPLE
First, you need to understand the difference between a population and a sample, and
identify the target population of your research.
The population is the entire group that you want to draw conclusions about.
The sample is the specific group of individuals that you will collect data from.
The population can be defined in terms of geographical location, age, income, and many
other characteristics.
It can be very broad or quite narrow: maybe you want to make inferences about the whole
adult population of your country; maybe your research focuses on customers of a certain
company, patients with a specific health condition, or students in a single school.
It is important to carefully define your target population according to the purpose and
practicalities of your project.
Sampling frame
The sampling frame is the actual list of individuals that the sample will be drawn from.
Ideally, it should include the entire target population (and nobody who is not part of that
population).
Example: Sampling frame
You are doing research on working conditions at Company X. Your population is all
1000 employees of the company. Your sampling frame is the company’s HR database
which lists the names and contact details of every employee.
Sample size
The number of individuals you should include in your sample depends on various factors,
including the size and variability of the population and your research design. There are
different sample size calculators and formulas depending on what you want to achieve
with statistical analysis.
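One commonly used formula is sketched below as an illustration. It assumes Cochran's formula for estimating a proportion, with a finite population correction. The formula, the function name and the default values (95% confidence, 5% margin of error, 50% expected proportion) are assumptions chosen for the example and are not prescribed by this lesson.

```python
import math


def cochran_sample_size(population_size, margin_of_error=0.05,
                        z_score=1.96, proportion=0.5):
    """Estimate a sample size with Cochran's formula (illustrative only).

    z_score=1.96 corresponds to a 95% confidence level, and
    proportion=0.5 is the most conservative assumption about variability.
    """
    # Sample size for a very large (effectively infinite) population.
    n0 = (z_score ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    # Finite population correction shrinks n0 for small populations.
    n = n0 / (1 + (n0 - 1) / population_size)
    # Round up, because you cannot survey a fraction of a person.
    return math.ceil(n)


# Company X from the example above has 1000 employees.
print(cochran_sample_size(1000))
```

A smaller margin of error or a more variable population increases the result, which is why the appropriate sample size depends on your research design.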
PROBABILITY SAMPLING METHODS
Probability sampling means that every member of the population has a chance of being
selected. It is mainly used in quantitative research. If you want to produce results that are
representative of the whole population, probability sampling techniques are the most
valid choice.
1. Simple random sampling
In a simple random sample, every member of the population has an equal chance of being
selected. Your sampling frame should include the whole population.
To conduct this type of sampling, you can use tools like random number generators or
other techniques that are based entirely on chance.
Example: Simple random sampling
You want to select a simple random sample of 100 employees of Company X. You assign
a number to every employee in the company database from 1 to 1000, and use a random
number generator to select 100 numbers.
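The Company X example can be sketched in a few lines of Python. This is a minimal illustration that uses the standard library's random module as the random number generator. The function name and the seed value are hypothetical, and the seed is there only to make the draw repeatable.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Draw employee numbers so that every employee has an equal chance."""
    rng = random.Random(seed)
    # Employees are numbered 1..population_size, as in the example.
    employee_numbers = range(1, population_size + 1)
    # random.sample draws without replacement, so nobody is picked twice.
    return sorted(rng.sample(employee_numbers, sample_size))


selected = simple_random_sample(1000, 100, seed=42)
print(selected[:10])  # first few selected employee numbers
```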
2. Systematic sampling
Systematic sampling is similar to simple random sampling, but it is usually slightly easier
to conduct. Every member of the population is listed with a number, but instead of
randomly generating numbers, individuals are chosen at regular intervals.
Example: Systematic sampling
All employees of the company are listed in alphabetical order. From the first 10 numbers,
you randomly select a starting point: number 6. From number 6 onwards, every 10th
person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of
100 people.
If you use this technique, it is important to make sure that there is no hidden pattern in the
list that might skew the sample. For example, if the HR database groups employees by
team, and team members are listed in order of seniority, there is a risk that your interval
might skip over people in junior roles, resulting in a sample that is skewed towards senior
employees.
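The interval-based selection described above could look like the following sketch. The function name and parameters are hypothetical. The starting point is drawn at random from the first interval unless you pass one in explicitly.

```python
import random


def systematic_sample(population_size, interval, start=None, seed=None):
    """Select every `interval`-th list position, beginning at `start`."""
    if start is None:
        # Random starting point within the first interval (1..interval).
        start = random.Random(seed).randint(1, interval)
    return list(range(start, population_size + 1, interval))


# Reproduce the example: 1000 employees, starting at 6, every 10th person.
positions = systematic_sample(1000, interval=10, start=6)
print(positions[:4], len(positions))
```

The code only picks list positions. It cannot detect a hidden pattern in the list, so checking how the list is ordered remains the researcher's job.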
3. Stratified sampling
Stratified sampling involves dividing the population into subpopulations that may differ
in important ways. It allows you to draw more precise conclusions by ensuring that every
subgroup is properly represented in the sample.
To use this sampling method, you divide the population into subgroups (called strata)
based on the relevant characteristic (e.g. gender, age range, income bracket, job role).
Based on the overall proportions of the population, you calculate how many people
should be sampled from each subgroup. Then you use random or systematic sampling to
select a sample from each subgroup.
Example: Stratified sampling
The company has 800 female employees and 200 male employees. You want to ensure
that the sample reflects the gender balance of the company, so you sort the population
into two strata based on gender. Then you use random sampling on each group, selecting
80 women and 20 men, which gives you a representative sample of 100 people.
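The proportional allocation in this example can be sketched as follows. The function name and the made-up employee IDs (F1, M1 and so on) are assumptions for illustration only.

```python
import random


def proportional_stratified_sample(strata, sample_size, seed=None):
    """Sample each stratum in proportion to its share of the population.

    `strata` maps a stratum name to the list of its members.
    """
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Share of the sample that this stratum should receive.
        # With awkward proportions, rounding may make the totals differ
        # slightly from sample_size and would need adjusting by hand.
        k = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, k)
    return sample


strata = {
    "female": [f"F{i}" for i in range(1, 801)],  # 800 employees
    "male": [f"M{i}" for i in range(1, 201)],    # 200 employees
}
sample = proportional_stratified_sample(strata, 100, seed=1)
print({name: len(members) for name, members in sample.items()})
```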
4. Cluster sampling
Cluster sampling also involves dividing the population into subgroups, but each subgroup
should have similar characteristics to the whole population. Instead of sampling individuals
from each subgroup, you randomly select entire subgroups.
If it is practically possible, you might include every individual from each sampled cluster.
If the clusters themselves are large, you can also sample individuals from within each
cluster using one of the techniques above. This is called multistage sampling.
This method is good for dealing with large and dispersed populations, but there is more
risk of error in the sample, as there could be substantial differences between clusters. It’s
difficult to guarantee that the sampled clusters are really representative of the whole
population.
Example: Cluster sampling
The company has offices in 10 cities across the country (all with roughly the same
number of employees in similar roles). You don’t have the capacity to travel to every
office to collect your data, so you use random sampling to select 3 offices – these are
your clusters.
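A minimal sketch of the office example is shown below. The office names, the employee IDs and the figure of 100 employees per office are invented for illustration. The key point is that whole clusters are selected at random, not individuals.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and return them with their members."""
    rng = random.Random(seed)
    # Sort the cluster names so that the draw is repeatable for a given seed.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


# Ten hypothetical offices with 100 employees each.
offices = {
    f"Office {i}": [f"O{i}-E{j}" for j in range(1, 101)]
    for i in range(1, 11)
}
selected_offices = cluster_sample(offices, 3, seed=7)
print(sorted(selected_offices))
```

For multistage sampling, you would then draw a random sample of individuals from each selected office instead of including everyone.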
NON-PROBABILITY SAMPLING METHODS
In a non-probability sample, individuals are selected based on non-random criteria, and
not every individual has a chance of being included.
This type of sample is easier and cheaper to access, but it has a higher risk of sampling
bias. That means the inferences you can make about the population are weaker than with
probability samples, and your conclusions may be more limited. If you use a
non-probability sample, you should still aim to make it as representative of the population
as possible.
1. Convenience sampling
A convenience sample simply includes the individuals who happen to be most accessible
to the researcher.
This is an easy and inexpensive way to gather initial data, but there is no way to tell if the
sample is representative of the population, so it can’t produce generalizable results.
Example: Convenience sampling
You are researching opinions about student support services in your university, so after
each of your classes, you ask your fellow students to complete a survey on the topic. This
is a convenient way to gather data, but as you only surveyed students taking the same
classes as you at the same level, the sample is not representative of all the students at
your university.
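2. Voluntary response sampling
Similar to a convenience sample, a voluntary response sample is mainly based on ease of
access. Instead of the researcher choosing participants and contacting them directly,
people volunteer themselves (e.g. by responding to a public online survey).
Voluntary response samples are always at least somewhat biased, as some people will
inherently be more likely to volunteer than others.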
Example: Voluntary response sampling
You send out the survey to all students at your university and a lot of students decide to
complete it. This can certainly give you some insight into the topic, but the people who
responded are more likely to be those who have strong opinions about the student support
services, so you can’t be sure that their opinions are representative of all students.
3. Purposive sampling
This type of sampling, also known as judgement sampling, involves the researcher using
their expertise to select a sample that is most useful to the purposes of the research.
It is often used in qualitative research, where the researcher wants to gain detailed
knowledge about a specific phenomenon rather than make statistical inferences, or where
the population is very small and specific. An effective purposive sample must have clear
criteria and rationale for inclusion. Always make sure to describe your inclusion and
exclusion criteria.
Example: Purposive sampling
You want to know more about the opinions and experiences of disabled students at your
university, so you purposefully select a number of students with different support needs
in order to gather a varied range of data on their experiences with student services.
4. Snowball sampling
If the population is hard to access, snowball sampling can be used to recruit participants
via other participants. The number of people you have access to "snowballs" as you get in
contact with more people.
Example: Snowball sampling
You are researching experiences of homelessness in your city. Since there is no list of all
homeless people in the city, probability sampling isn’t possible. You meet one person
who agrees to participate in the research, and she puts you in contact with other homeless
people that she knows in the area.
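5. Quota sampling
Quota sampling relies on the non-random selection of a predetermined number or
proportion of people from each subgroup of the population.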
For instance, if a researcher wants to study public opinion on a political issue, they might
use quota sampling by selecting a certain number of participants from different age
groups, genders, and income levels. They might interview 100 people under 30, 100
people between 30 and 50, and so on, until each quota is met. While this method allows
for a diverse range of participants, it can introduce bias because individuals are not
randomly chosen, potentially leading to results that don't accurately represent the entire
population's views.
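The quota-filling logic can be sketched as below. The respondent data, the small quotas of two per group and the exact age boundaries are hypothetical. People are taken in the order they happen to be encountered, which is exactly where the bias comes from.

```python
def age_group(person):
    """Assign a respondent to an age bracket (boundaries are assumed)."""
    age = person["age"]
    if age < 30:
        return "under 30"
    if age <= 50:
        return "30 to 50"
    return "over 50"


def quota_sample(respondents, quotas, group_of):
    """Accept respondents in arrival order until every quota is full."""
    counts = {group: 0 for group in quotas}
    selected = []
    for person in respondents:
        group = group_of(person)
        # Accept the person only if their group still has room.
        if group in counts and counts[group] < quotas[group]:
            selected.append(person)
            counts[group] += 1
        if counts == quotas:
            break  # every quota has been met
    return selected


ages = [22, 25, 41, 67, 19, 35, 52, 28, 44, 70]
respondents = [{"id": i, "age": age} for i, age in enumerate(ages)]
quotas = {"under 30": 2, "30 to 50": 2, "over 50": 2}
selected = quota_sample(respondents, quotas, age_group)
print([person["id"] for person in selected])
```

Note that there is no random step anywhere in this procedure, unlike stratified sampling, where individuals within each subgroup are chosen at random.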
ETHICAL CONSIDERATIONS IN RESEARCH
Ethical considerations in research are a set of principles that guide your research designs
and practices. Scientists and researchers must always adhere to a certain code of conduct
when collecting data from people.
The goals of human research often include understanding real-life phenomena, studying
effective treatments, investigating behaviors, and improving lives in other ways. What
you decide to research and how you conduct that research involve key ethical
considerations.
Research ethics matter for scientific integrity, human rights and dignity, and
collaboration between science and society.
These principles make sure that participation in studies is voluntary, informed, and safe
for research subjects/participants.
You’ll balance pursuing important research aims with using ethical research methods and
procedures.
Defying research ethics will also lower the credibility of your research because it’s hard
for others to trust your data if your methods are morally questionable.
Even if a research idea is valuable to society, it doesn’t justify violating the human rights
or dignity of your study participants.
Before you start any study involving data collection with people, you’ll submit your
research proposal to an institutional review board (IRB).
An IRB is a committee that checks whether your research aims and research design are
ethically acceptable and follow your institution’s code of conduct. They check that your
research materials and procedures are up to code.
If successful, you’ll receive IRB approval, and you can begin collecting data according to
the approved procedures. If you want to make any changes to your procedures or
materials, you’ll need to submit a modification application to the IRB for approval.
If unsuccessful, you may be asked to re-submit with modifications or your research
proposal may receive a rejection.
To get IRB approval, it’s important to explicitly note how you’ll tackle each of the ethical
issues that may arise in your study.
There are several ethical issues you should always pay attention to in your research
design, and these issues can overlap with each other.
You’ll usually outline ways you’ll deal with each issue in your research proposal if you
plan to collect data from participants.
1. Voluntary participation
Your participants are free to opt in or out of the study at any point in time.
2. Informed consent
Participants know the purpose, benefits, risks, and funding behind the study before they
agree or decline to join.
3. Anonymity
You don’t know the identities of the participants. Personally identifiable data is not
collected.
4. Confidentiality
You know who the participants are but you keep that information hidden from everyone
else. You anonymize personally identifiable data so that it can’t be linked to other data by
anyone else.
5. Potential for harm
Physical, social, psychological and all other types of harm are kept to an absolute
minimum.
6. Results communication
You ensure your work is free of plagiarism or research misconduct, and you accurately
represent your results.
DETAILED EXPLANATION ON THE TYPES OF ETHICAL ISSUES IN RESEARCH
1. Voluntary participation.
Voluntary participation means that all research subjects are free to choose to participate
without any pressure or coercion.
All participants are able to withdraw from, or leave, the study at any point without feeling
an obligation to continue.
Your participants don’t need to provide a reason for leaving the study.
It’s important to make it clear to participants that there are no negative consequences or
repercussions to their refusal to participate.
After all, they’re taking the time to help you in the research process, so you should
respect their decisions without trying to change their minds.
Example of voluntary participation
When recruiting participants for an experiment, you inform all potential participants that
they are free to choose whether they want to participate, and they can withdraw from the
study anytime without any negative repercussions.
Take special care to ensure there’s no pressure on participants when you’re working with
vulnerable groups of people who may find it hard to stop the study even when they want
to.
2. Informed consent.
Informed consent refers to a situation in which all potential participants receive and
understand all the information they need to decide whether they want to participate.
This includes information about the study’s benefits, risks, funding, and institutional
approval.
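Example of informed consent
You make sure to provide all potential participants with all the relevant information about
the study, including its benefits, risks, funding, and institutional approval.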
You also let them know that their data will be kept confidential, and they are free to stop
filling in the survey at any point for any reason.
They can also withdraw their information by contacting you or your supervisor.
Usually, you’ll provide participants with a text for them to read and ask them if they have
any questions. If they agree to participate, they can sign or initial the consent form. Note
that this may not be sufficient for informed consent when you work with particularly
vulnerable groups of people.
If you’re collecting data from people with low literacy, make sure to verbally explain the
consent form to them before they agree to participate.
For participants with very limited English proficiency, you should always translate the
study materials or work with an interpreter so they have all the information in their first
language.
In research with children, you’ll often need informed permission for their participation
from their parents or guardians. Although children cannot give informed consent, it’s best
to also ask for their assent (agreement) to participate, depending on their age and maturity
level.
3. Anonymity.
Anonymity means that you don’t know who the participants are and you can’t link any
individual participant to their data.
You can only guarantee anonymity by not collecting any personally identifying
information—for example, names, phone numbers, email addresses, IP addresses,
physical characteristics, photos, and videos.
In many cases, it may be impossible to truly anonymize data collection. For example,
data collected in person or by phone cannot be considered fully anonymous because some
personal identifiers (demographic information or phone numbers) are impossible to hide.
You’ll also need to collect some identifying information if you give your participants the
option to withdraw their data at a later stage.
Data pseudonymization
Data pseudonymization is an alternative method where you replace identifying information
about participants with pseudonymous, or fake, identifiers. The data can still be linked to
participants, but it's harder to do so because you separate personal information from the
study data.
Example of data pseudonymization
You’re conducting a survey with college students. You ask participants to enter
demographic information including their age, gender, nationality, and ethnicity. With all
this information, it may be possible for other people to identify individual participants, so
you pseudonymize the data.
Each participant is given a random three-digit number. You separate their personally
identifying information from their survey data and include the participant numbers in
both files. The survey data can only be linked to personally identifying data via the
participant numbers.
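The two-file approach described above can be sketched as follows. The field names and the sample records are invented for illustration, and a real project would also need secure storage for the identity file.

```python
import random


def pseudonymize(records, id_fields, seed=None):
    """Split records into an identity file and a survey file.

    The two files share only a random three-digit participant number.
    """
    rng = random.Random(seed)
    # Unique three-digit numbers (100-999), so at most 900 participants.
    codes = rng.sample(range(100, 1000), len(records))
    identity_file, survey_file = [], []
    for code, record in zip(codes, records):
        identifying = {k: v for k, v in record.items() if k in id_fields}
        answers = {k: v for k, v in record.items() if k not in id_fields}
        identity_file.append({"participant": code, **identifying})
        survey_file.append({"participant": code, **answers})
    return identity_file, survey_file


records = [
    {"name": "A. Banda", "email": "a@example.org", "age": 21, "q1": "agree"},
    {"name": "C. Phiri", "email": "c@example.org", "age": 23, "q1": "disagree"},
]
identity_file, survey_file = pseudonymize(
    records, id_fields={"name", "email"}, seed=5
)
print(survey_file)
```

Which fields count as identifying is a judgement call. In the survey example, a combination of age, gender, nationality and ethnicity may identify someone even without a name.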
4. Confidentiality.
Confidentiality means that you know who the participants are, but you remove all
identifying information from your report.
All participants have a right to privacy, so you should protect their personal data for as
long as you store or use it. Even when you can’t collect data anonymously, you should
secure confidentiality whenever you can.
Example of confidentiality
To keep your data confidential, you take steps to safeguard it and prevent any threats to
data privacy. You store all signed consent forms in a locked file drawer, and you
password-protect all files with survey data.
Only other researchers approved by the IRB are allowed to access the study data, and you
make sure that everyone knows and follows your institution’s data privacy protocols.
Some research designs aren’t conducive to confidentiality, but it’s important to make all
attempts and inform participants of the risks involved.
In a focus group study, you invite five people to give their opinions on a new student
service in a group setting.
Before beginning the study, you ask everyone to agree to keep what’s discussed
confidential and to respect each other’s privacy. You also note that you cannot
completely guarantee confidentiality or anonymity so that participants are aware of the
risks involved.
5. Potential for harm.
As a researcher, you have to consider all possible sources of harm to participants. Harm
can come in many different forms.
A. Psychological harm:
Sensitive questions or tasks may trigger negative emotions such as shame or anxiety.
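B. Social harm:
Participation can involve social risks, public embarrassment, or stigma.
C. Physical harm:
Pain or injury can result from the study procedures.
D. Legal harm:
Reporting sensitive data could lead to legal risks or a breach of privacy.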
It’s best to consider every possible source of harm in your study as well as concrete ways
to mitigate them. Involve your supervisor to discuss steps for harm reduction.
Make sure to disclose all possible risks of harm to participants before the study to get
informed consent. If there is a risk of harm, prepare to provide participants with resources
or counseling or medical services if needed.
Example of potential for harm
In a study on stress, you survey college students on their alcohol consumption habits.
Some of these questions may bring up negative emotions, so you inform participants
about the sensitive nature of the survey and assure them that their responses will be
confidential.
You also provide participants with information about student counseling services and
information about managing alcohol use after the survey is complete.
6. Results communication.
The way you communicate your research results can sometimes involve ethical issues.
Good science communication is honest, reliable, and credible. It’s best to make your
results as transparent as possible.
Take steps to actively avoid plagiarism and research misconduct wherever possible.
Plagiarism
Plagiarism means submitting others’ works as your own. Although it can be unintentional,
copying someone else’s work without proper credit amounts to stealing. It’s an ethical
problem in research communication because you may benefit by harming other
researchers.
Self-plagiarism is when you republish or re-submit parts of your own papers or reports
without properly citing your original work.
This is problematic because you may benefit from presenting your ideas as new and
original even though they’ve already been published elsewhere in the past.
You may also be infringing on your previous publisher’s copyright, violating an ethical
code, or wasting time and resources by doing so.
Example of duplication
You notice that two published studies have similar characteristics even though they are
from different years. Their sample sizes, locations, treatments, and results are highly
similar, and the studies share one author in common.
If you enter both data sets in your analyses, you get a different conclusion compared to
when you only use one data set. Including both data sets would distort your overall
findings.
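Research misconduct
Research misconduct means making up or falsifying data, manipulating data analyses, or
misrepresenting results in research reports.
Example of research misconduct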
In 1998, Andrew Wakefield and others published a now-debunked paper claiming that
the measles, mumps, and rubella (MMR) vaccine causes autism in children.
Later investigations revealed that they fabricated and manipulated their data to show a
nonexistent link between vaccines and autism. Wakefield also neglected to disclose
important conflicts of interest, and his medical license was taken away.
This fraudulent work sparked vaccine hesitancy among parents and caregivers. The rate
of MMR vaccinations in children fell sharply, and measles outbreaks became more
common due to a lack of herd immunity.
In reality, there is no risk of children developing autism from the MMR or other vaccines,
as shown by many large studies. Although the paper was retracted, it has actually
received thousands of citations.
Research scandals with ethical failures are littered throughout history, but some took
place not that long ago.
To demonstrate the importance of research ethics, we’ll briefly review two research
studies that violated human rights in modern history.
1. Nazi experiments
Nazi doctors and researchers performed painful and horrific experiments on thousands of
imprisoned people in concentration camps from 1942 to 1945.
The participation of prisoners was always forced, as consent was never sought.
Participants often belonged to marginalized communities, including Jewish people,
disabled people, and Roma people.
After some Nazi doctors were put on trial for their crimes, the Nuremberg Code of
research ethics for human experimentation was developed in 1947 to establish a new
standard for human experimentation in medical research.
2. Tuskegee syphilis study
The Tuskegee syphilis study was an American public health study that violated research
ethics throughout its 40-year run from 1932 to 1972. In this study, 600 young black men
were deceived into participating with a promise of free healthcare that was never fulfilled.
In reality, the actual goal was to study the effects of the disease when left untreated, and
the researchers never informed participants about their diagnoses or the research aims.
Although participants experienced severe health problems, including blindness and other
complications, the researchers only pretended to provide medical care.
When treatment became possible in 1943, 11 years after the study began, none of the
participants were offered it, despite their health conditions and high risk of death.
By the end of the study, 128 participants had died of syphilis or related complications.
The study ended only once its existence was made public and it was judged to be
"medically unjustified."
Ethical failures like these resulted in severe harm to participants, wasted resources, and
lower trust in science and scientists. This is why all research institutions have strict
ethical guidelines for performing research.