This document outlines the concepts of population, sample, and various sampling methods used in research. It explains the importance of selecting a representative sample and the differences between probability and non-probability sampling techniques. Additionally, it discusses the implications of sampling errors and biases on research validity.
Sampling Techniques
At the end of the lesson, I am expected to:
1. Define population, sample, probability sampling, and non-probability sampling;
2. Identify the kind of sampling needed for the study; and
3. Calculate the sample needed for the study.
A population is the entire group that you want to draw conclusions about.
A sample is the specific group that you will collect data from. The size of the sample is always less than the total size of the population.
In research, a population doesn’t always refer to people. It can mean a group containing elements of anything you want to study, such as objects, events, organizations, countries, species, or organisms.
Populations are used when your research question requires, or when you have access to, data from every member of the population. Usually, it is only straightforward to collect data from a whole population when it is small, accessible and cooperative.
EXAMPLE (collecting data from a population): A high school administrator wants to analyze the final exam scores of all graduating seniors to see if there is a trend. Since they are only interested in applying their findings to the graduating seniors in this high school, they use the whole population dataset.
For larger and more dispersed populations, it is often difficult or impossible to collect data from every individual. For example, every 10 years, the Philippine government aims to count every person living in the country using the Census. This data is used to distribute funding across the nation. However, historically, marginalized and low-income groups have been difficult to locate, contact, and persuade to participate. Because of non-responses, the population count is incomplete and biased towards some groups, which results in disproportionate funding across the country.
In cases like this, sampling can be used to make more precise inferences about the population.
When your population is large in size, geographically dispersed, or difficult to contact, it’s necessary to use a sample. With statistical analysis, you can use sample data to make estimates or test hypotheses about population data.
EXAMPLE: You want to study political attitudes in young people. Your population is the 300,000 undergraduate students in the Philippines. Because it’s not practical to collect data from all of them, you use a sample of 300 undergraduate volunteers from three Philippine universities who meet your inclusion criteria. This is the group who will complete your online survey.
Ideally, a sample should be randomly selected and representative of the population. Using probability sampling methods (such as simple random sampling or stratified sampling) reduces the risk of sampling bias and enhances both internal and external validity.
For practical reasons, researchers often use non-probability sampling methods. Non-probability samples are chosen according to specific criteria; they may be more convenient or cheaper to access. Because of non-random selection methods, any statistical inferences about the broader population will be weaker than with a probability sample.
Reasons for sampling:
Necessity: Sometimes it’s simply not possible to study the whole population due to its size or inaccessibility.
Practicality: It’s easier and more efficient to collect data from a sample.
Cost-effectiveness: There are fewer participant, laboratory, equipment, and researcher costs involved.
Manageability: Storing and running statistical analyses on smaller datasets is easier and more reliable.
When you collect data from a population or a sample, there are various measurements and numbers you can calculate from the data. A parameter is a measure that describes the whole population. A statistic is a measure that describes the sample. You can use estimation or hypothesis testing to estimate how likely it is that a sample statistic differs from the population parameter.
In your study of students’ political attitudes, you ask your survey participants to rate themselves on a scale from 1, very liberal, to 7, very conservative. You find that most of your sample identifies as liberal: the mean rating on the political attitudes scale is 3.2. You can use this statistic, the sample mean of 3.2, to make an informed estimate of the population parameter, that is, to infer the mean political attitude rating of all undergraduate students in the Philippines.
A sampling error is the difference between a population parameter and a sample statistic. In your study, the sampling error is the difference between the mean political attitude rating of your sample and the true mean political attitude rating of all undergraduate students in the Philippines. Sampling errors happen even when you use a randomly selected sample. This is because random samples are not identical to the population in terms of numerical measures like means and standard deviations.
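The idea of sampling error can be illustrated with a small simulation. The following is a minimal Python sketch, not part of the original lesson: the population of 300,000 ratings is randomly generated (made-up data, so its mean will not be 3.2), and the names `population`, `parameter`, `sampling_error`, and `typical_error` are hypothetical. It computes the sampling error of the mean for one random sample and then the average absolute error for a few sample sizes.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Hypothetical population: 300,000 made-up ratings on the 1-7 scale.
population = [rng.randint(1, 7) for _ in range(300_000)]
parameter = statistics.mean(population)  # the population mean (a parameter)


def sampling_error(sample_size):
    """Sample mean minus population mean for one simple random sample."""
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample) - parameter


def typical_error(sample_size, repeats=100):
    """Average absolute sampling error over repeated random samples."""
    errors = [abs(sampling_error(sample_size)) for _ in range(repeats)]
    return statistics.mean(errors)


print(f"one sample of 300: error = {sampling_error(300):+.3f}")
for n in (30, 300, 3000):
    print(f"n = {n:>4}: typical |error| = {typical_error(n):.3f}")
```

Because the samples are random, the exact numbers change with the seed, but the error is almost never exactly zero, and its typical size can be compared across the three sample sizes.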
Because the aim of scientific research is to generalize findings from the sample to the population, you want the sampling error to be low. You can reduce sampling error by increasing the sample size.
When you conduct research about a group of people, it’s rarely possible to collect data from every person in that group. Instead, you select a sample. The sample is the group of individuals who will actually participate in the research.
To draw valid conclusions from your results, you have to carefully decide how you will select a sample that is representative of the group as a whole. The procedure you use to make this selection is called a sampling method. There are two primary types of sampling methods that you can use in your research:
Probability sampling involves random selection, allowing you to make strong statistical inferences about the whole group.
Non-probability sampling involves non-random selection based on convenience or other criteria, allowing you to easily collect data.
You should clearly explain how you selected your sample in the methodology section of your paper or thesis, as well as how you approached minimizing research bias in your work.
First, you need to understand the difference between a population and a sample, and identify the target population of your research. The population is the entire group that you want to draw conclusions about. The sample is the specific group of individuals that you will collect data from.
The population can be defined in terms of geographical location, age, income, or many other characteristics. It can be very broad or quite narrow: maybe you want to make inferences about the whole adult population of your country; maybe your research focuses on customers of a certain company, patients with a specific health condition, or students in a single school.
It is important to carefully define your target population according to the purpose and practicalities of your project.
If the population is very large, demographically mixed, and geographically dispersed, it might be difficult to gain access to a representative sample. A lack of a representative sample affects the validity of your results, and can lead to several research biases, particularly sampling bias.
The sampling frame is the actual list of individuals that the sample will be drawn from. Ideally, it should include the entire target population (and nobody who is not part of that population).
EXAMPLE: You are doing research on working conditions at a social media marketing company. Your population is all 1000 employees of the company. Your sampling frame is the company’s HR database, which lists the names and contact details of every employee.
The number of individuals you should include in your sample depends on various factors, including the size and variability of the population and your research design. There are different sample size calculators and formulas depending on what you want to achieve with statistical analysis.
Probability sampling means that every member of the population has a known, non-zero chance of being selected. It is mainly used in quantitative research. If you want to produce results that are representative of the whole population, probability sampling techniques are the most valid choice.
In a simple random sample, every member of the population has an equal chance of being selected. Your sampling frame should include the whole population. To conduct this type of sampling, you can use tools like random number generators or other techniques that are based entirely on chance.
EXAMPLE: You want to select a simple random sample of 100 employees of a social media marketing company. You assign a number to every employee in the company database from 1 to 1000, and use a random number generator to select 100 numbers.
Systematic sampling is similar to simple random sampling, but it is usually slightly easier to conduct. Every member of the population is listed with a number, but instead of randomly generating numbers, individuals are chosen at regular intervals.
EXAMPLE: All employees of the company are listed in alphabetical order. From the first 10 numbers, you randomly select a starting point: number 6. From number 6 onwards, every 10th person on the list is selected (6, 16, 26, 36, and so on), and you end up with a sample of 100 people.
If you use this technique, it is important to make sure that there is no hidden pattern in the list that might skew the sample. For example, if the HR database groups employees by team, and team members are listed in order of seniority, there is a risk that your interval might skip over people in junior roles, resulting in a sample that is skewed towards senior employees.
Stratified sampling involves dividing the population into subpopulations that may differ in important ways. It allows you to draw more precise conclusions by ensuring that every subgroup is properly represented in the sample.
To use this sampling method, you divide the population into subgroups (called strata) based on the relevant characteristic (e.g., gender identity, age range, income bracket, job role). Based on the overall proportions of the population, you calculate how many people should be sampled from each subgroup. Then you use random or systematic sampling to select a sample from each subgroup.
EXAMPLE: The company has 800 female employees and 200 male employees. You want to ensure that the sample reflects the gender balance of the company, so you sort the population into two strata based on gender. Then you use random sampling on each group, selecting 80 women and 20 men, which gives you a representative sample of 100 people.
Cluster sampling also involves dividing the population into subgroups, but each subgroup should have similar characteristics to the whole population. Instead of sampling individuals from each subgroup, you randomly select entire subgroups.
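The contrast between the two subgroup-based designs can be made concrete with a minimal Python sketch. It is an illustration under stated assumptions, not a prescribed procedure: the function names, the employee IDs, and the split into 10 equally sized offices are all hypothetical. The stratified part reuses the 800/200 figures from the example above and samples individuals within every stratum; the cluster part keeps whole, randomly chosen subgroups.

```python
import random


def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size.

    Rounding can make the parts add up to slightly more or less than
    sample_size when the proportions are not whole numbers.
    """
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}


def stratified_sample(strata, sample_size, rng):
    """Draw a simple random sample from every stratum."""
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, sample_size)
    return {name: rng.sample(members, allocation[name])
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


rng = random.Random(0)  # fixed seed only so the sketch is repeatable

# Strata from the example above: 800 women and 200 men (IDs are made up).
strata = {
    "women": [f"W{i}" for i in range(800)],
    "men": [f"M{i}" for i in range(200)],
}
stratified = stratified_sample(strata, 100, rng)
print({name: len(members) for name, members in stratified.items()})
# {'women': 80, 'men': 20}

# Hypothetical clusters: 10 offices with 100 employees each.
clusters = {f"office_{k}": [f"E{k}_{i}" for i in range(100)]
            for k in range(1, 11)}
clustered = cluster_sample(clusters, 3, rng)
print(sorted(clustered))  # names of the 3 randomly selected offices
```

In the stratified design every subgroup contributes some individuals, whereas in the cluster design most subgroups contribute nobody and the selected ones contribute everyone.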
If it is practically possible, you might include every individual from each sampled cluster. If the clusters themselves are large, you can also sample individuals from within each cluster using one of the techniques above. This is called multistage sampling.
This method is good for dealing with large and dispersed populations, but there is more risk of error in the sample, as there could be substantial differences between clusters. It’s difficult to guarantee that the sampled clusters are really representative of the whole population.
EXAMPLE: The company has offices in 10 cities across the country (all with roughly the same number of employees in similar roles). You don’t have the capacity to travel to every office to collect your data, so you use random sampling to select 3 offices. These are your clusters.
In a non-probability sample, individuals are selected based on non-random criteria, and not every individual has a chance of being included. This type of sample is easier and cheaper to access, but it has a higher risk of sampling bias. That means the inferences you can make about the population are weaker than with probability samples, and your conclusions may be more limited. If you use a non-probability sample, you should still aim to make it as representative of the population as possible.
Non-probability sampling techniques are often used in exploratory and qualitative research. In these types of research, the aim is not to test a hypothesis about a broad population, but to develop an initial understanding of a small or under-researched population.
A convenience sample simply includes the individuals who happen to be most accessible to the researcher. This is an easy and inexpensive way to gather initial data, but there is no way to tell if the sample is representative of the population, so it can’t produce generalizable results. Convenience samples are at risk for both sampling bias and selection bias.
EXAMPLE: You are researching opinions about student support services in your university, so after each of your classes, you ask your fellow students to complete a survey on the topic. This is a convenient way to gather data, but as you only surveyed students taking the same classes as you at the same level, the sample is not representative of all the students at your university.
Similar to a convenience sample, a voluntary response sample is mainly based on ease of access. Instead of the researcher choosing participants and directly contacting them, people volunteer themselves (e.g. by responding to a public online survey). Voluntary response samples are always at least somewhat biased, as some people will inherently be more likely to volunteer than others, leading to self-selection bias.
EXAMPLE: You send out the survey to all students at your university and a lot of students decide to complete it. This can certainly give you some insight into the topic, but the people who responded are more likely to be those who have strong opinions about the student support services, so you can’t be sure that their opinions are representative of all students.
Purposive sampling, also known as judgement sampling, involves the researcher using their expertise to select a sample that is most useful to the purposes of the research. It is often used in qualitative research, where the researcher wants to gain detailed knowledge about a specific phenomenon rather than make statistical inferences, or where the population is very small and specific. An effective purposive sample must have clear criteria and rationale for inclusion. Always make sure to describe your inclusion and exclusion criteria and beware of observer bias affecting your arguments.
EXAMPLE: You want to know more about the opinions and experiences of disabled students at your university, so you purposefully select a number of students with different support needs in order to gather a varied range of data on their experiences with student services.
If the population is hard to access, snowball sampling can be used to recruit participants via other participants. The number of people you have access to “snowballs” as you get in contact with more people. The downside here is also representativeness, as you have no way of knowing how representative your sample is due to the reliance on participants recruiting others. This can lead to sampling bias.
EXAMPLE: You are researching experiences of homelessness in your city. Since there is no list of all homeless people in the city, probability sampling isn’t possible. You meet one person who agrees to participate in the research, and she puts you in contact with other homeless people that she knows in the area.
Pritha Bhandari (2020), Population vs. Sample | Definitions, Differences & Examples. https://www.scribbr.com/methodology/sampling-methods/