MBA-e Finance-Business Research Methods and Analytics-Unit 4
SEMESTER-I
All rights reserved. No Part of this book may be reproduced in any form without permission
in writing from TeamLease Edtech Pvt. Ltd.
CONTENT
STRUCTURE
4.2 Introduction
4.5.1 Sample
4.5.5 Census
4.15 Summary
● Help the students to gain an understanding of Data Analysis & Research Report.
4.2 INTRODUCTION
With rising competition in the market, businesses need new methods to stay ahead. Every
business person wants to make their work faster and easier. Alongside this, the main goal
of a business is consumer satisfaction, and the best way to analyse consumer satisfaction
is to take feedback from the consumers themselves. It is impossible to collect input from
every person individually, and that is where we need “sampling.” In sampling, a small group
of people, called a sample, is selected and analysed in depth, because a well-chosen sample
represents the whole population.
Further, the whole population's feedback is reported based on the representative sample
chosen from it. Sampling is therefore of high importance in taking feedback and in analysing
and testing products or services. In this unit, you will learn about the importance of
sampling in business research. Using sampling in business has various advantages, such as
saving time and money. You will also learn about the types of sampling methods, because a
sample cannot be selected from the entire population arbitrarily. The selection has to be
done properly and with a clear justification, as the sample you select acts as a
representative of the whole population. You will learn how to select the best sampling
method according to your needs. The unit will also explain the errors in sampling and the
methods to reduce or avoid them.
You will also learn about the steps involved in sample designing and distributions in
statistics.
Before starting, it is important to know that, according to Levin and Rubin, statisticians
use the word population to refer not only to people but to all the units or items chosen for
analysis. They use the word sample for a portion selected for study purposes.
The method of sampling makes the work faster and easier. You will also learn about the
census, sampling procedure, and sampling frame in this unit.
Sampling is a process in which a prespecified number of observations is taken from a larger
population, whether the analysis is based on people or items. The units in a sample are
selected on a random basis so that they can represent the whole group.
● Market Research: Expanding the consumer base means finding new market niches. A
market niche is defined as a group of individuals with similar income, gender,
demographics, age, and geographical location. It also includes individuals with
the same education level and marital status. Sampling a particular niche population
lets the business know whether the niche is a lucrative prospect that should be pursued
or is just lukewarm.
● New Product Development: One approach is to develop a product first and then find a
market for it. The other, more important, approach is to listen to the market and
develop a product that solves a problem. Both approaches benefit from sampling the
market or potential customers. For example, if you are a
manufacturer of chips, you may find that most of the market does not care one way or
the other about the facts contained in it, but a small portion of consumers does. So,
sampling becomes important in business to identify all the types of potential
customers.
● Customer Satisfaction: Very few businesses ask all of their customers whether they
are satisfied, and doing so is not possible for a business that serves a large number
of customers per day. In this case, sampling the customers is the right way to find
out whether they are satisfied. Sample at different times and on different days to
get the complete picture. Many businesses use customer sampling in this way. It may
make sense to hire an outside firm to conduct the sampling and run the survey so that
customers feel comfortable saying how they really feel and not only what you want to
hear.
Challenges in Sampling:
Undoubtedly, sampling is a very beneficial method for businesses, but it has many
challenges. The sample has to reflect the entire population; otherwise, the results are not
meaningful. The survey should not lead the customers towards the answers the business
wants. The sampling should be completed within a short period rather than an extended one,
as the sample might change or an event may occur that changes the respondents' answers. So,
you have to be very careful while conducting the sampling process.
● Relate the meaning of sampling and its importance in the practical world.
Advantages of sampling:
Sampling is very advantageous in performing business research. The key advantages of using
sampling in business are as follows:
● The method of sampling saves time up to a great extent by reducing the volume of
data. You do not have to go through individual items for the analysis in business.
● It avoids monotony in work and analysis. You do not have to repeat the query
multiple times to all individual data.
● When you have a short time, a survey without sampling becomes impossible. It helps
you to get near-accurate results in much less time.
● When you use proper sampling methods, you are likely to achieve a high level of
accuracy compared to when you do not use sampling due to reduction in monotony,
data handling issues, etc.
● You get detailed information on data even by the usage of a small number of
resources.
● The sampling scope is also high as the investigator is usually concerned with data
generalization, and sampling helps in it.
In statistics, a population is a set of similar items or events that are of interest for a
question or experiment. The goal of statistical analysis is to find out information about
the population by studying a sample chosen from it. A population may refer to an entire
group of people, hospital visits, objects, events, etc. So, the population can be defined as
the aggregate of subjects grouped by a common feature.
1. Finite Population
2. Infinite Population
3. Existent Population
4. Hypothetical Population
1. Finite Population: A finite population is one in which the number of people or
entities can be counted. A finite population is easier to work with than an
infinite population. Examples of finite populations: the number of students in a
university, the number of employees in a company.
4. Hypothetical Population: A population whose entities or units are not available in
physical form is called a hypothetical population. Examples: the outcomes of tossing a
coin, the outcomes of rolling a die, etc.
You must be aware of some basic terms like a sample, sampling frame, sampling error, and
sampling size. These terms are described as follows:
4.5.1 Sample
It is a small part that represents the whole group. A sample is taken out when there is a large
quantity or population size, and it is difficult to manage to test or verify all of them.
In more specific words, a sample is a limited or finite part of the statistical population whose
properties are to be studied to gain information about all of them.
For Example: If you need to find out, within a day, the opinion of the students of a
university about its canteen facilities, it would be impossible to hear the responses from
every student. So here you would select a group of students, known as a sample, and take
their opinions to be representative of all the students of the university.
A sampling frame is a list of the whole population or items from which the sample is taken
out.
For Example, you might have a sampling frame of names of all students of the school for a
survey you are going to conduct on their economic condition. The units selected in the
sampling frame should have the following properties:
It is difficult to get a perfect or ideal sampling frame. Sampling frames may have defects too.
Based on these defects, the sampling frames can be divided into the following types:
-Incomplete frames: When some units out of whole units are omitted from the sampling
frames, they are known as incomplete frames.
-Inaccurate frames: When the frame does not represent the data accurately, they are called
inaccurate frames.
-Inadequate Frames: A frame that does not include all population units by its structure is
called an inadequate frame.
-Out of Date Frames: These are frames that have not been kept up to date.
Sampling error is the difference between the results of a sample and the results of the whole
population.
Sample size is the number of observations included in the study. It is usually represented by
n. For Example, if you conduct Covid tests on a sample of 100 people, then your sample size
is 100.
Sample size influences two properties - precision of the estimates we make and the power of
a study to draw various conclusions.
You should also be aware of sampling fraction along with sample size:
Sampling fraction: It is defined as the ratio of the sample size n to the population size N. It is
denoted by n/N
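The ratio n/N can be checked with a few lines of code. The sketch below is only an illustration: the function name `sampling_fraction` and the figures (100 people tested in a town of 5,000) are assumptions, not values from this unit.

```python
def sampling_fraction(sample_size: int, population_size: int) -> float:
    """Return n/N, the share of the population included in the sample."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 <= sample_size <= population_size:
        raise ValueError("sample_size must be between 0 and population_size")
    return sample_size / population_size


# Hypothetical figures: 100 people tested out of a town of 5,000.
fraction = sampling_fraction(100, 5000)
print(f"Sampling fraction n/N = {fraction:.2%}")
```

A small sampling fraction means that only a small share of the population is observed, so the way the sample is chosen matters all the more.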
4.5.5 Census
A census is a study that covers all the units of a population in order to describe its
features. It usually refers to the complete enumeration of all persons in the population.
For Example, surveying the economic condition of the students in a school by asking each
one of them is a census.
Merits of the census:
2. Deep study is possible: In a census, observations are taken from each
individual. As every person's opinion counts, the resulting observations are
detailed.
Demerits:
1. It consumes more time and energy, as data has to be collected from the whole
population.
2. It is an expensive process.
3. If the sample is not a good representative, the results will not be correct and may lead
to wrong conclusions as the conclusions are dependent on the sample selected.
5. Taking a census is not possible in every situation. It is not always possible to get
expected responses from people.
A sample survey is a survey carried out using a sampling method, i.e., one in which a part
of the population, not the whole, is surveyed. For example, if you want data on the economic
condition of a state, you would take the data from a particular group of people in the state
and not from the whole state.
2. The researchers can also analyze the statistical error in their investigation.
1. It does not produce 100% accurate results. The investigator may be biased at the time
of selection of samples that can lead to inaccurate results.
2. When the investigation area is broad, this method is used. For example: testing the
effect of a drug on people by industry.
3. When there are insufficient resources or members, this method can be adopted.
5. When there is no need for high accuracy, this method can be used.
The three basic principles for the design of a sample survey are:
2. Principles of validity: A sample should be selected in such a way that results could be
interpreted objectively in terms of probability. So this principle ensures the validity of the
sample drawn.
3. Principles of statistical regularity: According to this principle, a reasonably large
number of items chosen at random from a large group is almost sure to possess the large
group's characteristics. In other words, the sample should have the features of the
population from which it is drawn.
Principle steps in sample survey:
The main steps in the planning and execution of a sample survey are under the following
heads:
1. Objectives: The objectives of the survey should be defined in a clear and concrete
form. Some of the objectives may be immediate and some may be far-reaching, so
the investigator should match these objectives to the available resources in
terms of the money and manpower required for the survey.
2. Defining the population: The population from which the sample is chosen should be
defined clearly and unambiguously. The geographical boundaries of the population
must be specified appropriately.
3. Sampling frame and sampling units: The sampling units must cover the entire
population, and they must be distinct and non-overlapping.
Once you have defined the sampling units, you must see whether a sampling frame is
available or not, as we know that the sampling frame is a list of all the units in the
population.
1. Selection of Proper sample design: This is the most vital step in planning a sample
survey. The design should take into account the available resources and the time
limit.
2. Method of data collection: For the collection of data for a survey, you must select an
appropriate method. The data can be collected by interview method or the mail
method.
3. Data to be collected: When you start collecting the data, you must ensure that the
data fulfils the objectives. The questionnaire should follow a specified order. The
questions should be clear, brief, corroborative, and non-offending.
5. Summary and Analysis of data: This is the last step, where inferences are drawn from
the collected data. This step includes the following:
● The completed questionnaires should be carefully scrutinized to find out whether the
data is consistent or not.
● After the data is scrutinized, edited, and tabulated, a deep analysis is done.
1. Resource constraints: If the population is not small enough and it is difficult to
cover the entire population, it is important to identify a subset of the population to
work upon. The group should be carefully defined because it has to be representative
of the whole population. This process is called survey sampling, and it is one of the
most critical processes in survey design. There are fixed costs associated with a
survey, whatever the size of the sample. Once the survey starts, the marginal costs of
collecting data from more people are proportional to the chosen
sample size.
Concluding the population: Researchers are interested in samples and apply the conclusions
that come out from that sample to the entire population and carry the process accordingly. A
sample survey provides greater scope as compared to the census. Working within given
resource constraints, sampling may make it possible to study a larger geographical area or
find out more information about the same population by analysing its area in greater depth
through a smaller sample.
The following table shows the differences between a census and a survey:
Census Survey
Step 1: First, select a random sample of a specific size from a given population.
Step 2: Calculate a statistic for the sample, such as the mean, standard deviation, or
median.
Step 3: Repeat steps 1 and 2 for many samples, and develop a frequency distribution of the
sample statistic calculated in the above step.
Step 4: Finally, plot the frequency distribution of the sample statistic developed in the
above step. The graph obtained is the sampling distribution of that statistic.
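The four steps above can be tried out with a small simulation. The following is a minimal sketch, not a prescribed procedure: the population of 10,000 spending figures, the sample size of 25, and the 2,000 repetitions are all assumed values chosen for illustration.

```python
import random
import statistics
from collections import Counter

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 monthly spending figures (illustrative only).
population = [random.gauss(500, 100) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 25              # size of each sample
num_samples = 2000  # number of repeated samples

# Steps 1 and 2: draw repeated random samples and compute the mean of each.
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(num_samples)
]

# Step 3: frequency distribution of the sample means, in bins of width 10.
bin_width = 10
frequencies = Counter(int(m // bin_width) * bin_width for m in sample_means)

# Step 4: a crude text plot of the frequency distribution.
for lower in sorted(frequencies):
    print(f"{lower:4d}-{lower + bin_width:<4d} {'#' * (frequencies[lower] // 10)}")

print("mean of sample means:", round(statistics.fmean(sample_means), 2),
      "| population mean:", round(mu, 2))
print("sd of sample means:", round(statistics.stdev(sample_means), 2),
      "| sigma / sqrt(n):", round(sigma / n ** 0.5, 2))
```

The last two lines compare the centre and spread of the simulated sample means with the population mean and with σ/√n, which should come out close to each other.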
Let $\bar{X}$ be the sample mean of a random sample of size n drawn from a population having
mean µ and standard deviation σ. Then the mean and standard deviation of $\bar{X}$ are
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
Theorem:
Let $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$
be the sample mean of a random sample of size n drawn from a normal population having
mean µ and standard deviation σ. Then $\bar{X}$ follows an exact normal distribution with
mean µ and standard deviation $\sigma/\sqrt{n}$.
Theorem (Central Limit Theorem). If $\bar{X}$ is the mean of a random sample of size n taken
from a population with mean µ and finite variance $\sigma^2$, then the limiting form of the
distribution of
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
as n → ∞ is the standard normal distribution $n(z; 0, 1)$.
In other words, if a random sample of size n is selected from any population with mean µ and
standard deviation σ, then $\bar{X}$ is approximately normal with mean µ and standard
deviation $\sigma/\sqrt{n}$ when n is sufficiently large.
$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$
In this formula, $\bar{x}$ is the sample mean, µ is the population mean, s is the sample
standard deviation, and n is the sample size.
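The t formula can be computed directly from a list of observations. The sketch below assumes a hypothetical set of eight customer satisfaction scores and a hypothesised population mean of 6; the function name `t_statistic` is also an illustrative choice.

```python
import math
import statistics


def t_statistic(sample, mu):
    """One-sample t statistic: (sample mean - mu) / (s / sqrt(n))."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to estimate s")
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (x_bar - mu) / (s / math.sqrt(n))


# Hypothetical data: satisfaction scores from 8 customers, tested against mu = 6.
scores = [6, 8, 7, 9, 5, 8, 7, 6]
print(round(t_statistic(scores, 6), 3))
```

Here `statistics.stdev` divides by n − 1, which is the usual convention for the sample standard deviation s in this formula.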
Normal Distribution:
A normal distribution is also called a bell curve. These are the distributions with
characteristics like a symmetrical bell-shaped curve, and the respective mean and median are
the same number and are at the center of the curve.
Fig 4.1 Distribution curves
The picture below shows the difference between both the curves.
Accuracy and Precision are the two most important properties that a sample should possess.
A good sample is necessary as its results and views are assumed to be the results or views of
many people or items. For that, a good sample must possess the following characteristics:
1. Aim- Oriented: A good sample is always aim-oriented. It should fit the prime
objectives of the research.
4. Random: A good sample should be selected at random, which means every item has
an equal probability to be picked up. It makes the sample truly a good representative
of the whole large group.
5. Simple: Simplicity is one of the main characteristics that a good sample must possess.
A simple design will make it easy for people to understand it. Moreover, it must be
practical too.
1. Companies use the method of sampling to identify the needs of their target
consumers.
2. CPAs (Certified Public Accountants) use this method to conduct various audits to
determine the accuracy of account balances.
3. Some companies even use sampling to test their products that are to be used in large
quantities.
Methods of sampling:
It is very important to select a suitable method of sampling while doing any experiment or
Analysis. There are mainly two types of sampling methods:
1. Probability sampling
2. Non-probability sampling
Probability sampling: It is a type of sampling method in which the samples are selected at
random. In this method, all the items have an equal probability of being selected. It is mainly
used in quantitative research.
Non-probability sampling: This method is different from the Probability sampling method. In
this, the samples are not selected on a random basis. In this, some items of the population
have a higher probability of being selected. It is mainly used in qualitative research.
4.9 PROBABILITY SAMPLING METHODS
As discussed in the previous section, it is a sampling method in which every item has an
equal probability of becoming a part of the sample. There are mainly 4 types of probability
sampling techniques:
1. Simple Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Multi-stage Sampling
In Simple Random Sampling, every member or item has an equal probability of being selected.
The only condition is that the sampling frame from which the sample is selected should
include the whole population. You can use a random number generator to make the selection.
For example: Suppose you want to select a random sample of 50 people from a company. You
simply assign a number to each employee and use a random number generator to select those
50 people.
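The random number generator approach in this example can be sketched as follows. The company size of 500 employees is an assumption made for illustration, since the text does not state it.

```python
import random

# Hypothetical sampling frame: employee numbers 1..500 (the company size is assumed).
employee_ids = list(range(1, 501))

rng = random.Random(2024)                # seeded only to make the sketch repeatable
chosen = rng.sample(employee_ids, k=50)  # 50 distinct employees, each equally likely

print(sorted(chosen))
```

`random.sample` draws without replacement, so no employee can be selected twice.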
In this type of sampling method, every member is assigned a specific number, and instead of
being selected randomly, the sample members are chosen at regular intervals. It is more
convenient than the random sampling method.
The intervals here are called skip intervals. The skip interval can be calculated as follows:
K= N/n
K= skip interval
N= universe size
n=sample size
For example: If you want to select a sample of 5 people from a group of 100 people, you will
assign numbers 1 to 100 to them, and you will select one member from every 20 members. In
this way, you will select 5 people numbered as 20,40,60,80,100.
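The skip interval calculation and the selection above can be written as a short function. This is a sketch under assumptions: the function name `systematic_sample` is illustrative, and the random starting point used when `start` is omitted is a common practice that this unit does not itself describe.

```python
import random


def systematic_sample(units, sample_size, start=None):
    """Pick every K-th unit, where K = N // n is the skip interval.

    `start` is the 1-based position of the first unit chosen. If it is omitted,
    a random start between 1 and K is used (an assumption, not from the text).
    """
    if sample_size < 1 or sample_size > len(units):
        raise ValueError("sample_size must be between 1 and the number of units")
    k = len(units) // sample_size
    if start is None:
        start = random.randint(1, k)
    return [units[i - 1] for i in range(start, start + k * sample_size, k)]


people = list(range(1, 101))                   # members numbered 1 to 100
print(systematic_sample(people, 5, start=20))  # → [20, 40, 60, 80, 100]
```

With N = 100 and n = 5 the skip interval is K = 20, which reproduces the members numbered 20, 40, 60, 80 and 100.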
The risk lies in how the list used in sampling is organized, which the researcher must take
into account. The sampling interval must also be chosen with care.
● The main advantage of systematic sampling is its simplicity. Once the intervals are
decided, the sampling can easily proceed.
In the Stratified Sampling method, the whole population is divided into subpopulations or
sub-groups, called “strata”, based on a characteristic such as age, gender, income, etc. It
gives more precise results as it ensures that every subgroup is present in the sample.
After that, the sample members are selected from each subgroup by the random sampling
method or the systematic sampling method.
For example, a company has 500 employees who are above 30 in age and 200 employees
who are below 30, and you want to ensure that the sample reflects the company's age
balance. So, you divide the whole population into two sub-groups, one that includes
people aged below 30 and the other people aged above 30. Then you use random sampling on
each group, selecting 50 people above 30 in age and 20 below 30
years old.
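The split of 50 and 20 in this example follows from keeping each group's share of the sample equal to its share of the population. A minimal sketch, assuming a total sample of 70 and made-up employee IDs:

```python
import random

rng = random.Random(7)  # seeded only to make the sketch repeatable

# Hypothetical employee IDs for the two strata in the example.
strata = {
    "above_30": [f"A{i}" for i in range(1, 501)],  # 500 employees
    "below_30": [f"B{i}" for i in range(1, 201)],  # 200 employees
}
total_sample = 70
population_size = sum(len(members) for members in strata.values())

sample = {}
for name, members in strata.items():
    # Proportional allocation: stratum share of the population times the total sample.
    quota = round(total_sample * len(members) / population_size)
    sample[name] = rng.sample(members, quota)

print({name: len(chosen) for name, chosen in sample.items()})
# → {'above_30': 50, 'below_30': 20}
```

Each stratum is sampled separately with simple random sampling, so both age groups are guaranteed to appear in the final sample.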
In Cluster Sampling, the population is also divided into subgroups, but here each subgroup
should have characteristics similar to those of the whole population. The sample members
are not taken as individuals; instead, entire subgroups are selected as the sample. If the
clusters or subgroups are large, you may also sample individuals from within each cluster
using any of the above methods. This method is good for large and dispersed populations,
but there are more chances of error in this method due to differences between the clusters.
The clusters may not represent the whole population.
For example, a firm has 20 offices across the country. The role of employees is roughly
the same in each. As it is not possible to go to every office to collect the data, you can
use cluster sampling and use 3 offices to represent the whole firm. These 3 offices will
act as your clusters.
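The office example can be sketched in code as below. The office names, the 30 employees per office, and the subsample of 10 per office are assumptions made for illustration.

```python
import random

rng = random.Random(11)  # seeded only to make the sketch repeatable

# Hypothetical frame: 20 offices, each with an assumed list of 30 employees.
offices = {
    f"office_{i:02d}": [f"office_{i:02d}_emp_{j:02d}" for j in range(1, 31)]
    for i in range(1, 21)
}

# Choose 3 whole offices (clusters) at random.
chosen_offices = rng.sample(sorted(offices), k=3)

# One-stage cluster sample: everyone in the chosen offices is surveyed.
cluster_sample = [emp for office in chosen_offices for emp in offices[office]]

# Optional second stage for large clusters: a random subsample within each office.
two_stage_sample = [
    emp for office in chosen_offices for emp in rng.sample(offices[office], k=10)
]

print(chosen_offices, len(cluster_sample), len(two_stage_sample))
```

The second stage corresponds to the case described above in which the clusters are large and individuals are sampled from within each chosen cluster.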
Concept of randomization:
You need to know that the samples are not chosen haphazardly. They are chosen in a
systematically random way so that the operation of probability is utilized. In the cases in
which random selection is not possible, other systematic means are used. Randomization is a
sampling method that is used in scientific experiments. Although the word random gives a
feeling of haphazardness when sampling is being performed, the randomness in selecting
units as samples is also very systematic. The concept of randomness is basic in scientific
observation and research. It rests on the assumption that, although individual events cannot
be predicted with accuracy, aggregate events can be. Randomness reduces bias to a great
extent. The basic aim of randomization is to ensure that each unit is equally likely to be
chosen for the sample or assigned to a treatment in the experiment or research.
● The first application is to select a group of individuals for observations. This group
acts as a representative of the whole population.
1. Convenience Sampling
3. Purposive Sampling
4. Snowball Sampling
Fig 4.3 Types of Non-Probability Sampling
A convenience sample includes those individuals or entities most accessible to the person
performing the research. It is an easy sampling method. It is economical as well, but it
does not ensure that the sample is representative of the population, so it cannot be relied
on to produce accurate results.
For example: Suppose you want to collect feedback about the student support services in
your University, and therefore after each class, you ask your students to give the feedback.
So, this will become an example of convenience Sampling as you take the samples based on
your convenience. It doesn't have students from the entire University but only the students
that you teach.
In this sampling method, the researcher doesn't choose the samples, but the people who
actively participate or volunteer in the research themselves are taken as samples. This is a
biased method as there is a group of people who are more likely to volunteer as compared to
others.
For example, you decide to run a survey at your University. Many students complete it, but
those who choose to respond are likely to be the ones with strong opinions. The survey
therefore gives no data about the students who do not have strong opinions on the subject.
This type of sampling is also known as Judgement Sampling. In this, the researcher selects
the samples based on his requirements and expertise. It is mainly used for conducting
qualitative research.
For example: If you want to study or research the hostel services provided by your
University, you would select only hostel students in order to study their experiences in
the hostel. This is an example of Purposive Sampling, in which the sample is selected for a
particular purpose.
If a population is hard to reach, it is difficult to draw a sample from it directly. So, in
this method, you start with the people you have access to and recruit further participants
through them. It is also known as chain sampling.
For example, you want to collect data about homeless people. It is impossible to find all
the homeless people in the entire population, so you would select one person and ask them
to put you in touch with other homeless people. This way, you would select the sample using
the Snowball Sampling method.
● This allows the researcher to reach a population that is difficult to reach and sample
by using other sampling methods.
● The researcher in this type of sampling has little control over the sampling method.
The subjects that the researcher can obtain mainly rely on the previous subjects
observed.
● The sample's representativeness is not ensured, as the person who is conducting the
research has no idea of the true distribution of the population or even of the sample.
● In this sampling technique, sampling bias is also a fear of researchers. The subjects
may share the same traits and features, so the researcher may get only a small
subgroup of the entire population.
After knowing about all the sampling methods, the next few pages will help you gain deeper
knowledge about sampling and conduct your research in the best way.
Probability Sampling | Non-Probability Sampling
Information about the whole population is available. | Information about the entire population is not available.
Probability sampling is costly and time-consuming. | Non-probability sampling is economical and less time-consuming as compared to the probability sampling method.
It is good to study and gain knowledge about all the types of sampling methods. But when
you start a business, a question will come to mind: “Which method should
I choose?” This section explains how to choose the best sampling method according to
your business requirements.
First of all, you need to know that every sampling method works best if applied in the right
place. By the best sampling method, we mean the method that most effectively meets the
goals of the study or analysis in question.
The effectiveness of the sampling method depends on many factors. You can use the strategy
given below to select the best method according to your needs:
2. Identify a potential Sampling method that might help you to achieve your goals.
4. Finally, choose the method that does the best job of achieving the aim.
There are mainly two costs that underlie sampling analysis: the cost of collecting data and
the cost of an incorrect inference resulting from the data. While doing analysis, the
researchers must consider the two causes of incorrect inferences, viz. systematic bias and
sampling error. Systematic bias is an error that stems from how the research is conducted
and controlled by the researcher.
Inappropriate sampling frame: If the frame used for reference is not proper, then a biased
representation may occur, resulting in a systematic bias.
The measuring instrument might be physically damaged or old enough to give an incorrect
reading. The data collected from the experiment could hence be misleading, resulting in a
systematic bias.
If workers are unable to respond properly or give correct estimates, there
might be an error in the measurements. The reason is that the measure of what is to
be estimated is correlated with the level of awareness the worker possesses.
Indeterminacy Principle
Individuals behave differently when kept under close observation than when they are
not observed. For example, if workers are informed that the time in which they complete a
task is being studied and that the quota for piece work will be set accordingly, they tend
to work more slowly than they would if they were left unobserved. Thus, the indeterminacy
principle may lead to systematic bias.
Natural bias in reporting of data
This may be one of the pertinent causes of systematic bias. For example, the income data
collected by the government taxation department shows a downward bias, whereas that
collected by the social organizations shows an upward bias. The reason for these deficiencies
is that people generally report less income for tax purposes and overstate if asked for social
reasons so that they could get the necessary benefits. These factors culminate in the
occurrence of bias.
Self-selection occurs when the respondents decide entirely for themselves whether to
participate in a survey or not. It is useful when we want to allow units, whether
individuals or organizations, to choose to take part in the research of their own accord.
The surveys may simply be posted on a website so that anyone visiting or browsing can choose
to take them, or they may be promoted through advertisements or publicized in digital or
print media.
For example, the scientists who conduct experiments using human subjects may advertise the
need for volunteers to participate in drug trials or research on physical activity. The key
component is that research subjects or organizations volunteer to take part in their own free
will. They are not approached by the researcher directly.
There may be a wide range of reasons why people and organizations volunteer for such
studies, including having particularly strong opinions about the research, a specific
interest in the study or its findings, or simply wanting to help out the researcher(s).
The biggest drawback of this kind of survey is that its results cannot be taken to represent
a larger population. But this does not entirely undermine the usefulness of the research.
The web has been a significant tool for reaching populations that are hard to identify,
particularly because of their location. Probability-based sampling would be
unlikely to reach them in sufficient numbers. For example, in the example given above, the
use of the web could help in fielding a survey among varied population groups residing in
different parts of the world. The sample received is not generalizable, but it could be
found consistent, with the minimum possible errors.
Also, Alvarez et al. (2002) proposed that these types of non-probability
samples can be useful in designing web pages or web surveys by randomly assigning
members of the sample to control and experimental groups.
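Random assignment of sample members to control and experimental groups can be sketched as follows. This is an illustration only, not the procedure used in the cited study: the function name `assign_groups` and the volunteer IDs are hypothetical.

```python
import random


def assign_groups(respondents, seed=None):
    """Shuffle respondents and split them into control and experimental halves."""
    rng = random.Random(seed)
    shuffled = list(respondents)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "experimental": shuffled[half:]}


# Hypothetical volunteer IDs from a web survey.
volunteers = [f"user_{i}" for i in range(1, 21)]
groups = assign_groups(volunteers, seed=3)
print(len(groups["control"]), len(groups["experimental"]))
```

Because the split is made after shuffling, every volunteer has the same chance of landing in either group, even though the volunteers themselves were self-selected.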
Sampling for internet-based surveys involves first identifying a population of internet users
who will volunteer (often they are provided incentives for completing surveys) and then
taking a sample of this population. Organizations such as GfK (Growth from Knowledge),
formerly Knowledge Networks, recruit internet users as potential survey respondents. When a
client approaches GfK with an internet survey, GfK takes a sample of its potential
respondents and invites them to participate in the survey.
Samples of potential respondents are taken in a variety of ways. For the 2012 ANES, GfK used
two sampling techniques: random digit dialing (RDD) and address-based sampling (ABS). In the
context of this survey, potential respondents were called via telephone and asked if they
would like to participate in the internet-based surveys. Those who did not possess a laptop
or internet service were provided with the same (the laptop was returned to GfK after the
study). Address-based sampling was used alongside RDD because pure RDD sampling can miss
some respondents through such things as do-not-call lists and caller ID devices, and also
because ANES did not want to contact potential respondents via
cell phone.
Any critique of a particular web survey approach must be done in the context of its intended
purpose and the claim it makes. Glorifying or condemning an entire approach to survey data
collection should not be done based on a single implementation, nor should all web surveys
be treated as equal.
Hence, the fact that a particular method does not allow generalizing beyond the sample does not
diminish its usefulness and authenticity.
The table below illustrates the strengths and weaknesses associated with different
sampling techniques. With the help of this table, you can choose the sampling
method best suited to your needs.
Techniques Strengths Weaknesses
For better understanding, examples of the sampling methods present in the table above
are stated as follows:
1. Convenience Sampling: Suppose you want feedback on your teaching methods and you choose
only the toppers as your sample. This is convenience sampling, because you have chosen the
sample according to your own convenience.
2. Snowball Sampling: Suppose you want to collect data about homeless people. It is impossible
to locate every homeless person in the population, so you select one person and ask that
person to lead you to other homeless people. This is snowball sampling.
3. Simple Random Sampling: Suppose you select a sample of 25 employees from your company by
the chit (lottery) method. This is simple random sampling, because every employee has an
equal chance of being selected.
4. Systematic Sampling: Suppose you want to select 10 people from 100 and you select every
10th person. This is systematic sampling, because you have introduced a system for selecting
the sample.
5. Stratified Sampling: For example, 100 students from a school of 1000 are asked about their
favourite subject. Different grades study different subjects, so the students must first
be divided according to their grades; the grades are called strata.
Grade    Number of students
6        250
7        300
8        200
9        100
10       150
Calculate the sample of each grade using the stratified random sampling formula:
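A minimal sketch of this calculation follows, assuming proportional allocation, in which the sample for a grade is (students in the grade / total students) x total sample size. The function name and the dictionary layout are illustrative choices.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Return the sample size for each stratum under proportional allocation."""
    population = sum(strata_sizes.values())
    return {
        stratum: round(sample_size * size / population)
        for stratum, size in strata_sizes.items()
    }

# Grade -> number of students, taken from the figures listed above
grades = {6: 250, 7: 300, 8: 200, 9: 100, 10: 150}
allocation = proportional_allocation(grades, sample_size=100)
print(allocation)  # {6: 25, 7: 30, 8: 20, 9: 10, 10: 15}
```

With these figures every share is a whole number. In general, rounding can make the shares add up to slightly more or less than the intended sample size, so the total should be checked.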
6. Cluster Sampling: In a survey of students from a city, we first select a sample of schools,
then classrooms within those schools, and then students from those classrooms.
In this way we work with clusters, or groups of elements, and the survey becomes easier.
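The simple random and systematic examples above can be sketched as follows. The employee IDs are hypothetical, and the random starting point in the systematic version is an assumption added here, since the example in the text only says that every 10th person is selected.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k units so that every unit has an equal chance of selection."""
    return random.Random(seed).sample(population, k)

def systematic_sample(population, k, seed=None):
    """Pick a random start, then take every (N // k)-th unit."""
    interval = len(population) // k
    start = random.Random(seed).randrange(interval)
    return population[start::interval][:k]

people = list(range(1, 101))                   # hypothetical IDs 1..100
srs = simple_random_sample(people, 25, seed=1)
sys_sample = systematic_sample(people, 10, seed=1)
print(srs)
print(sys_sample)
```

The systematic version needs only one random number, which is why it is often easier to carry out in the field than drawing every unit at random.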
Many mistakes may occur in sampling because of biased or misleading sampling. A sample
that is not a good representative of the whole population is called a biased sample. According
to Yule and Kendall, bias may be due to imperfections in the observer's instruments,
personal qualities, or defective methods and techniques. Sampling is like conducting
an experiment: it is difficult to eliminate error entirely, but there is always scope to
reduce it. There are two types of errors noticed in sampling:
1. Sampling Error
n= sample size
We can conclude that when the sample size increases, the error decreases, and when the
variance or dispersion increases, the sampling error also increases.
Here, p represents the proportion of successes (favorable responses), q = 1 - p represents the
proportion of failures, and n is the sample size, that is, the total number of people who have
responded. The standard error has its highest value when p = q, which occurs when each is 0.5.
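Both statements about sampling error can be checked numerically. The sketch below assumes the usual textbook formulas, standard deviation / sqrt(n) for a mean and sqrt(p x q / n) for a proportion; the function names and the input values are illustrative.

```python
import math

def standard_error_mean(std_dev, n):
    """Standard error of a sample mean: std_dev / sqrt(n)."""
    return std_dev / math.sqrt(n)

def standard_error_proportion(p, n):
    """Standard error of a sample proportion: sqrt(p * q / n), with q = 1 - p."""
    q = 1 - p
    return math.sqrt(p * q / n)

# Larger samples reduce the error; larger dispersion increases it
print(standard_error_mean(10, 100), standard_error_mean(10, 400), standard_error_mean(20, 100))

# For a fixed n, the error of a proportion is largest at p = q = 0.5
n = 400  # hypothetical number of respondents
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(standard_error_proportion(p, n), 4))
```

Because p = 0.5 gives the largest standard error, it is commonly used as the cautious choice when the true proportion is unknown.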
B-Non-Sampling Error:
Before moving on to the determination of sample size, you must be aware of the other sources
of error in sampling. The other types of error are:
● Non-Response Error: these errors arise when some of the selected people do not respond,
so that the respondents are not representative of the entire population.
● Data Coding Errors: these are the errors that arise in coding or entering the data.
2. The sample size is too big: in this, the whole study may become expensive and
tedious.
So, the sample size should be neither too big nor too small. It is very important to select an
appropriate sample size according to our requirements.
The sample size may be chosen based on any of the following: cost, time, or convenience.
Sample size variables: Before calculation of sample size, you need to determine the
following factors:
1. Level of Precision: The level of Precision is also known as Sampling Error. It is the
range in which the true value is estimated to be. It is expressed in percentage. For
example, if a researcher wants to conclude that 50% of students liked the new study
method with a precision rate of 5%, he would say that 45% to 55% of the students
have liked the new study method.
2. Confidence Level: It is the level of confidence that the actual mean
falls within your margin of error.
3. Confidence Interval or Margin of Error: The margin of error is the maximum error
that you have decided to allow in the sampling. It is expressed in terms of the
mean. In other words, it is the error that you have agreed to allow between the
mean of your sample and the mean of your population.
4. Standard Deviation: The standard deviation tells you how much your results vary
from each other and from the mean. A low standard deviation means that the results
are close to the mean, and a high standard deviation means that the results are
spread out and lie far from the mean.
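To make these variables concrete, the sketch below compares two hypothetical samples that share the same mean but differ in spread. It assumes the common large-sample approximation, margin of error = z x standard deviation / sqrt(n), with z = 1.96 for a 95% confidence level; the data and the function name are made up for illustration.

```python
import math
import statistics

def margin_of_error(sample, z=1.96):
    """Margin of error for the sample mean; z = 1.96 corresponds to 95% confidence."""
    return z * statistics.stdev(sample) / math.sqrt(len(sample))

# Two hypothetical samples with the same mean but different spread
tight = [48, 49, 50, 51, 52]
spread = [30, 40, 50, 60, 70]

for name, sample in (("tight", tight), ("spread", spread)):
    mean = statistics.mean(sample)
    moe = margin_of_error(sample)
    interval = (round(mean - moe, 2), round(mean + moe, 2))
    print(name, mean, round(statistics.stdev(sample), 2), interval)
```

The sample with the larger standard deviation produces the wider interval. With only five observations the z-based interval is a rough guide; a t-based interval would be more appropriate for such small samples.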
1. You can select small samples based on experience. This can result in wide
confidence intervals.
2. The sample size can also be chosen using a target variance for the estimates obtained
from the sample. This approach is used when high precision, that is, narrow confidence
intervals, is required.
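Calculation of Sample Size: The sample size can be calculated by two methods.
By using a formula:
When the population size is greater than 10000, the following formula is used:
\[ n = \frac{z^{2}\,p\,q}{d^{2}} \]
For a smaller population of size N, this value is adjusted as:
\[ n_o = \frac{n}{1 + n/N} \]
Here z is the standard normal value for the chosen confidence level, p is the estimated
proportion of the attribute in the population, q = 1 - p, d is the margin of error, and N is
the population size.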
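The formula method can be sketched in Python as follows. The reading of the symbols (z as the standard normal value for the confidence level, d as the margin of error, N as the population size) and the input values are assumptions made for this example.

```python
import math

def sample_size_large_population(z, p, d):
    """n = z^2 * p * q / d^2, with q = 1 - p."""
    q = 1 - p
    return (z ** 2) * p * q / (d ** 2)

def adjust_for_population(n, population_size):
    """Adjustment for a smaller population: n / (1 + n / N)."""
    return n / (1 + n / population_size)

# Assumed inputs: 95% confidence (z = 1.96), p = 0.5, margin of error d = 0.05
n = sample_size_large_population(z=1.96, p=0.5, d=0.05)
n_adjusted = adjust_for_population(n, population_size=2000)
print(math.ceil(n))           # 385
print(math.ceil(n_adjusted))  # 323
```

The results are rounded up because a fraction of a respondent cannot be surveyed, and rounding down would give slightly less precision than requested.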
By using a table:
Table for Determining Minimum Returned Sample Size for a given Population Size for
Continuous and Categorical Data is as follows:
Table 4.4 Calculation of sample size using table method.
For example, if you want to conduct a study on road safety, you would select the population
based on any of the following:
● You can take the people who use two-wheeler vehicles only.
● You can take the rickshaw pullers and the bicycle riders.
So, defining the population, that is, selecting a valid population, is the first step in the
sampling process.
● It should be a complete list. It should contain all the elements of the population.
The next step is to decide whether to take a census or a sample in the
research process. In a census, every member of the population is interviewed, whereas
in sampling only the selected members are interviewed. If a census is conducted, the values
obtained are called parameters; if sampling is conducted, the values obtained are called
statistics, which serve as estimates of the parameters. Usually, a census is conducted when
the population is small, and sampling is conducted when the population is large.
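The difference between a census value and a sample value can be seen in a small simulation. The population of ages below is randomly generated and purely hypothetical; the seed is fixed only so that the run is repeatable.

```python
import random
import statistics

rng = random.Random(7)                                   # fixed seed for reproducibility
population = [rng.randint(18, 65) for _ in range(5000)]  # hypothetical ages

parameter = statistics.mean(population)   # census: every member is measured
sample = rng.sample(population, 200)      # sampling: only 200 members are measured
statistic = statistics.mean(sample)

print(round(parameter, 2), round(statistic, 2))
```

The sample mean is close to, but usually not exactly equal to, the population mean. That gap is the sampling error, and it is the price paid for measuring 200 people instead of 5000.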
Step 5: Select the sampling method. Finally, once you have decided whether to use probability
or non-probability sampling, the last step is to choose the specific sampling method.
4.14 PRACTICAL CONSIDERATIONS IN DETERMINING SAMPLE
AND SAMPLE SIZE
When you determine the sample and the sample size, the risks involved are among the
considerations to keep in mind:
● Sampling risk
1. Confidence and Significance: The terms confidence and significance, as used by
statisticians, are related to sampling risk. They indicate how convincing the
results of the research are. If significance and confidence are low, the sample
results are not very convincing.
Confidence: It is the idea of being certain that the estimates made or the results based on
samples correctly represent the population.
Significance: It is the idea that the results are not due to random chance alone. The term
denotes that there is convincing evidence, based on sample data, that a difference
exists.
2. Variance: Variance is one of the most important factors to be considered. The higher
the variance in the population, the more samples we need to take to get accurate results.
Likewise, if greater precision is required, so that smaller differences can be detected,
more samples are required.
Now we shall study the biases that may occur in selecting a sample. There are different types
of biases that may affect the selection of samples. They are:
1. Data Mining Bias
2. Sample Selection Bias
3. Survivorship Bias
4. Look-ahead Bias
Biases in sampling:
1. Data Mining Bias: Data mining is the practice of analysing historical data and trends
to determine or predict future behaviour. It has limitations: it sometimes yields
statistically irrelevant results, and the trends it predicts may be
irrelevant.
2. Sample Selection Bias: This bias occurs when a part of the population is left out
because data for it is unavailable. The resulting sample is therefore not
representative of the whole population.
3. Survivorship Bias: This bias arises when information about financial vehicles that no
longer exist is left out. For example, most mutual fund databases exclude the funds
that have underperformed. Conclusions drawn from such data may underestimate or
overestimate the population parameters.
4. Look-ahead bias: This type of bias occurs when data that is not available at the time
of research is used in a simulation of that time. Research with this bias would not
show accurate results.
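Survivorship bias in particular is easy to demonstrate with numbers. The returns below are invented for illustration, and the rule that every fund with a negative return was closed and dropped from the database is an assumption of this sketch.

```python
import statistics

# Hypothetical annual returns (%) for ten mutual funds
all_funds = [12.0, 9.5, 7.0, 4.0, 1.5, -2.0, -5.5, 8.0, 10.5, -8.0]

# Assumption: funds with negative returns were closed and removed from the database
survivors = [r for r in all_funds if r >= 0]

mean_all = statistics.mean(all_funds)
mean_survivors = statistics.mean(survivors)
print(round(mean_all, 2))        # 3.7
print(round(mean_survivors, 2))  # 7.5
```

A researcher who sees only the surviving funds would conclude that the average return is about twice what it really was across all funds.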
4.15 SUMMARY
Sampling is undoubtedly a very productive process. It saves a lot of time and effort in
business research and many other activities. Sampling is a process used in statistical analysis
in which several samples are taken from a larger population. The method used to find out a
sample is based on the requirements of the researcher. Sampling has many advantages. It is
an economical way of carrying on business research. It saves a lot of time. Also, in statistics,
a population is a set of events or items that are of interest to the same experiment or research.
There are four categories of population in statistics: finite population, infinite population,
existent population, and hypothetical population. There are two types of sampling:
probability sampling and non-probability sampling. Probability sampling is the type in which
the samples are collected on a random basis. Every unit has an equal probability of becoming
a sample. In non-probability sampling, the samples are not collected on a random basis, and
the units do not have an equal probability of becoming a sample. Probability sampling
can be divided further into the following types: simple random sampling, stratified sampling,
cluster sampling, and systematic sampling. Non-probability sampling can be divided into
convenience sampling, voluntary response sampling, purposive sampling, and snowball
sampling. A good sample's characteristics are that it should be accurate, aim-oriented,
proportional, random, simple, economical, and clear. A sampling frame is the list of units of
the entire population. A good sampling frame should be relevant and accurate and should
cover the whole population. Sampling frames may have defects, according to which they can
be classified into the following categories: incomplete, inaccurate, inadequate, and out of
date. The errors in sampling include sampling errors and non-sampling errors. There may also
be other errors, such as measurement errors, non-response errors, and data coding errors.
The sample size can be calculated by using a formula or a table too. The sample design
process includes the following steps: Defining the population, constructing an appropriate
sampling frame, deciding whether to take a sample or census, selecting the category of
sampling to be used and, lastly, deciding the sampling method. It is important to take valid
considerations into account in sampling: confidence and significance, selecting a meaningful
sample size, and variance. There are four types of biases possible in sampling: data mining
bias, sample selection bias, survivorship bias, and look-ahead bias. The sampling method has obtained
great importance. The sample studies are becoming more and more popular. The vastness of
the population, the problems that occur in contacting many people, high refusal rates,
difficulties in ascertaining the universe make sampling the best alternative in all the research
and social studies. The recent developments in this method are more reliable. The results also
have high accuracy. The data collection becomes very easy by using sampling.
A. Descriptive Questions
1. Sample
2. Sample size
3. Sampling frame
a. Explain the steps involved in the sample design process and the practical considerations in
sampling and sample size?
b. How to determine the sample size? Explain by both table method and formula method.
Scenario-based questions:
c. Suppose you start a CT bank business and, to ensure the supply of non-defective CT banks,
you decide to test them. How would you apply sampling to this?
e. What, according to you, is the best sampling method for a survey that is to be conducted
online?
5. Suppose you want to design a questionnaire for a survey on road safety. Describe your
strategy and your method of implementation.
a. Quota Sampling
b. Convenience sampling
c. Snowball sampling
d. Stratified sampling
a. Distribution
b. Population
c. Data
d. Set
3. If the sample size increases, what would happen to the sampling error?
a. It would increase
b. It would reduce
c. No effect on it
a. Probability
b. Sampling error
c. Random
d. Non-random
a. Probability Sampling
b. Stratified sampling
d. Systematic Sampling
c. Difficult to justify
d. Once
a. Sampling design
b. Sampling frame
c. Sample
d. Sampling distribution
a. Sample size
b. Sampling frame
c. Sampling error
d. Sampling distribution
10. We use census when the population is limited. Is this statement true or false?
a. True
b. False
Answers:
C. Test Yourself
2. The sampling frame for any probability sample is a complete list of all the cases in
the population from which your sample has been drawn.
3. A perfectly representative sample is one that exactly represents the population
from which it is drawn.
Answers:
Reference Books
● Cochran, W.G. (1977) Sampling Techniques, New York: John Wiley.
● Haber, A. and Runyon, R.P. (1972) General Statistics, USA: Addison-Wesley.
● Tull, D.S. and Albaum, G.S. (1973) Survey Research: A Decisional Approach, USA: Intext
Educational Publishers.
● Kothari, C.R. (1985) Research Methodology: Methods and Techniques, New Delhi: Wiley
Eastern.
● Singh, D. and Chaudhary, F.S. (1986) Theory and Analysis of Sample Survey Designs,
New Delhi: Wiley Eastern.
● Kypri, K. and Gallagher, S.J. (2003) 'Incentives to increase participation in an
internet survey of alcohol use: a controlled experiment,' Alcohol and Alcoholism,
38(5): 437-441.
● Sheehan, K.B. (1999) 'Using e-mail to survey internet users in the United States:
methodology and assessment,' Journal of Computer-Mediated Communication, 4(3).
Accessed online at http://jcmc.indiana.edu/vol14/issue3/sheehan.html on July 6, 2006.
● Dillman, D.A. (2007) Mail and Internet Surveys: The Tailored Design Method.
● Groves, R.M. (1989) Survey Errors and Survey Costs.
● Ackoff, R.L. (1953) The Design of Social Research, Chicago: University of Chicago
Press.
● Zikmund, W.G. (2002) Business Research Methods, Dryden: Thomson Learning.