Data Collection and Processing

Data collection is a crucial step in research: gathering and measuring information, either as primary data or as secondary data. Primary data is collected directly from sources and is generally considered more reliable, while secondary data is information that others have previously gathered. Methods for collecting primary data include observation, experimentation, interviews, and surveys, each with its own advantages and limitations.

Uploaded by shwetayadav.1889

DATA COLLECTION AND PROCESSING
UNIT 3

DATA COLLECTION

Data collection is defined as the procedure of collecting, measuring and
analyzing accurate insights for research using standard validated techniques.

A researcher can evaluate their hypothesis on the basis of collected data.

In most cases, data collection is the primary and most important step for
research, irrespective of the field of research. The approach to data collection
differs between fields of study, depending on the required information.

TYPES OF DATA

• PRIMARY DATA
• SECONDARY DATA

PRIMARY DATA

Primary data is a type of data that is collected by researchers directly from
main sources through interviews, surveys, experiments, etc. Primary data are
usually collected at the source, where the data originate, and are regarded as
the best kind of data in research.

Example
• An organization doing market research about a new product (say, a phone) it
is about to release will need to collect data such as purchasing power, feature
preferences, daily phone usage, etc. from the target market. The data from past
surveys are not used because the product differs.

FEATURES OF PRIMARY DATA

• FIRST HAND INFORMATION
• TIME CONSUMING
• EXPENSIVE
• PAPER WORK AND DOCUMENTATION
• VARIOUS METHODS
• AVAILABILITY
• RELIABILITY
• ACCURACY

FACTORS INFLUENCING PRIMARY DATA COLLECTION

• OBJECTIVES OF RESEARCH
• TIME
• COST
• AVAILABILITY OF RESEARCH STAFF
• AVAILABILITY OF RESPONDENTS

METHOD OF COLLECTING PRIMARY DATA

• OBSERVATION METHOD
• EXPERIMENTATION METHOD
• SURVEY METHOD
• INTERVIEW METHOD

OBSERVATION METHOD

• Observation, as the name implies, is a way of collecting data through
observing. The observation data collection method is classified as a
participatory study, because the researcher has to immerse herself in the
setting where her respondents are, while taking notes and/or recording.

Advantages

• Simplest way
• Useful for framing hypotheses
• Greater accuracy
• Universal method
• Useful where verbal information fails
• Independent of people's willingness to give information

Disadvantages

• Observation requires a great deal of time
• It is expensive, as staff must be trained to make and record the observations
• It may not give complete information
• Observation may be judgemental

Types of Observation

• Structured and Unstructured
• Disguised and Undisguised
• Mechanical

Experimentation Method

• The experimental method involves the manipulation of variables to establish
cause and effect relationships. The key features are controlled methods and the
random allocation of participants into control and experimental groups.

• An experiment is an investigation in which a hypothesis is scientifically
tested. In an experiment, an independent variable (the cause) is manipulated
and the dependent variable (the effect) is measured; any extraneous variables
are controlled.

Advantages

• It provides first hand information
• It gives reliable and relevant information
• It helps in developing new techniques and methods

Disadvantages

• Expensive
• Time consuming
• Unsuccessful or delayed result

Types of Experimentation

• Field Experiment
• Lab Experiment
• Natural Experiment

Interview Method

• An interview is generally a qualitative research technique which involves
asking open-ended questions to converse with respondents and collect data about
a subject.
• Interviews are conducted with a sample from a population, and the key
characteristic they exhibit is their conversational tone.
• Focused Group Interview
• In-depth Interview
Advantages

• RELIABILITY
• DETAILED INFORMATION
• HELPS IN HYPOTHESIS FORMULATION
• FLEXIBILITY
• PERSONAL TOUCH
Disadvantages

• Time consuming
• Expensive
• Documentation and Paperwork
• Respondent and interviewer bias
• Sampling problem
Types of Interview

Personal Interview and Focused Group Interview

• Line of thought of the interviewer
• Formal and Informal
• Structured and Unstructured
• Individual and Group
• General or specific interview
Survey Method

A survey is a research method used for collecting data from a predefined group
of respondents to gain information and insights into various topics of
interest. Surveys can have multiple purposes, and researchers can conduct them
in many ways depending on the methodology chosen and the study's goal.
Types

• Telephone
• Mail
• Internet
Schedules

The schedule is a formalized set of questions, statements, and spaces for
answers, provided to the enumerators who ask questions to the respondents and
note down the answers. While a questionnaire is filled in by the informants
themselves, enumerators fill in the schedule on behalf of the respondent.
Purpose

• To provide a standardized tool for observation
• To act as a memory tickler
• To facilitate the work of tabulation and analysis
Types of Schedules

• Rating Schedule
• Documents Schedule
• Survey Schedule
• Observation Schedule
• Structured or Unstructured
Framing of a Schedule

• Study all aspects of the problem
• Clarity
• Sequencing of Questions
• Pre-testing of Schedule
• Division of Schedule
• Appropriate form of questionnaires


Features

• Personal Contact
• Identity of Respondents
• Bias
• Nature of Respondents
• Use of Computers
• Time
• Response Rate
• Area of Coverage
• Cost
Questionnaire

• A questionnaire is a research instrument that consists of a set of questions
or other types of prompts that aims to collect information from a respondent.
A research questionnaire is typically a mix of close-ended questions and
open-ended questions. Open-ended, long-form questions offer the respondent the
ability to elaborate on their thoughts. Research questionnaires were developed
in 1838 by the Statistical Society of London.
• The data collected from a data collection questionnaire can be both
qualitative and quantitative in nature. A questionnaire may or may not be
delivered in the form of a survey, but a survey always consists of a
questionnaire.
Importance of Questionnaire

• It collects people's viewpoints
• More data can be collected
• It gives a summary of the demographic situation
• Less time consuming
• It helps study behaviour
• It collects sensitive information
• It creates a database
Essentials of Good Questionnaire

• Relevant Questions
• Clarity
• Restricted no. of Questions
• Type of Questions: Open and Close ended
• Sequence of Questions
• Pilot Study
Advantages of Primary Data

• Data collected is up-to-date.
• Relevant and specific to your research objectives.
• Primary research can deliver 'trade secrets'.
• Competitors have no access to your data, giving you a competitive edge.
• Findings can be applied to the entire market.
• It's possible to conduct low-level market research, such as an online
survey, cheaply and easily.
Limitations of Primary Data

• It can be expensive.
• It is time-consuming and can take a long time to complete if it involves
face-to-face contact with customers.
• It requires some prior information about the subject, and ideally market
research skills to get the best results.
• Attracting enough customers to take part in your survey, especially when
doing it yourself, can be challenging.
Secondary Research/Data

Secondary research means research that has previously been undertaken, usually
by another business or organisation, but is publicly available for free (such
as government statistics) or paid-for (such as a research paper by an
organization).
Secondary research is characterized by:

• Gathering previously researched information
• Based on already analysed and interpreted information and data
• Use of data that has been collected by someone other than the researcher
• Same data being available to both you and your competitors
• Fast and easy, ideal for gaining a broad understanding of a market quickly
and cheaply
• Immediate data availability
Sources of Secondary Data
Advantages of Secondary Data

• Supplement Primary Data
• It could be less expensive
• Quick Decision
• Less Time consuming
• Less Processing of Data
• No Sampling Errors
• Large volume of Data
Limitations of Secondary Data

• Problem of Accuracy and Reliability
• Problem of Adequacy
• Lack of In-depth information
• Lack of potential in handling specific problems
• Problem of Biased information
• May involve huge cost

TASK
Write down the difference between Primary data and Secondary data.
Sampling
Significance of Sampling

• TIME SAVING
• LESS COMPLEX
• DETAILED INFORMATION CAN BE COLLECTED
• CONVENIENT TO RESEARCHER
• ECONOMICAL
• SUITABLE FOR ACADEMIC AND MARKET-BASED RESEARCH
• QUALITY RESEARCH WORK
Method of Sampling

Probability Method

Probability sampling is a sampling technique where a researcher sets a few
selection criteria and chooses members of a population randomly. All the
members have an equal opportunity to be a part of the sample with this
selection parameter.

Non-Probability Method

In non-probability sampling, the researcher chooses members for research based
on convenience or judgement rather than by random selection. This sampling
method is not a fixed or predefined selection process. This makes it difficult
for all elements of a population to have equal opportunities to be included in
a sample.
Simple Random Sampling

• It is a reliable method of obtaining information where every single member
of a population is chosen randomly, merely by chance. Each individual has the
same probability of being chosen to be a part of a sample.
For example, in an organization of 500 employees, if the HR team decides on
conducting team building activities, it is highly likely that they would prefer
picking chits out of a bowl. In this case, each of the 500 employees has an
equal opportunity of being selected.
• Lottery Method
• Random Tables
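
The lottery method for the 500-employee example can be sketched in a few lines of Python. This is only an illustration: the employee IDs, the sample size of 20, and the function name `simple_random_sample` are assumptions made for the sketch, not part of the slides.

```python
import random

def simple_random_sample(population, sample_size, seed=None):
    """Draw sample_size members without replacement.

    Every member has the same chance of selection, like chits in a bowl.
    The seed is optional and only makes the draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(population, sample_size)

# Hypothetical employee IDs EMP001 .. EMP500 (assumed for illustration).
employees = [f"EMP{i:03d}" for i in range(1, 501)]
chosen = simple_random_sample(employees, 20, seed=42)
print(chosen[:5])
```

Sampling without replacement matches the chit-drawing picture: once a chit is drawn, the same employee cannot be picked again.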
Systematic Sampling

• Researchers use the systematic sampling method to choose the sample members
of a population at regular intervals. It requires the selection of a starting
point for the sample and a sampling interval that is repeated through the
population. This type of sampling method has a predefined range, and hence this
sampling technique is the least time-consuming.
For example, a researcher intends to collect a systematic sample of 500 people
in a population of 5000. He/she numbers each element of the population from
1 to 5000 and will choose every 10th individual to be a part of the sample
(Total population / Sample Size = 5000/500 = 10).
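
The 5000/500 example can be written out as a short Python sketch. The random starting point within the first interval is a common convention that is assumed here; the slide only says that a starting point must be selected. The function name is hypothetical.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick every k-th member, where k = population size // sample size."""
    interval = len(population) // sample_size          # 5000 // 500 = 10
    # Assumption: the starting point is drawn at random from the first interval.
    start = random.Random(seed).randrange(interval)
    return population[start::interval][:sample_size]

population = list(range(1, 5001))   # elements numbered 1..5000
sample = systematic_sample(population, 500, seed=1)
print(len(sample), sample[:3])
```

Once the start is fixed, the rest of the sample is fully determined, which is why the method is quick to carry out.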
Cluster Sampling

• Cluster sampling is a method where the researchers divide the entire
population into sections or clusters that represent a population. Clusters are
identified and included in a sample based on demographic parameters like age,
sex, location, etc. This makes it very simple for a survey creator to derive
effective inference from the feedback.
• For example, if the United States government wishes to evaluate the number of
immigrants living in the Mainland US, they can divide it into clusters based on
states such as California, Texas, Florida, Massachusetts, Colorado, Hawaii,
etc. This way of conducting a survey will be more effective as the results will
be organized into states and provide insightful immigration data.
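
A minimal sketch of the state-based example follows. It assumes one-stage cluster sampling, in which whole clusters are picked at random and everyone in a chosen cluster is included; the slide does not spell out this step. The resident records and the function name are hypothetical.

```python
import random
from collections import defaultdict

def cluster_sample(records, cluster_key, n_clusters, seed=None):
    """Group records into clusters, then keep every record of the chosen clusters."""
    clusters = defaultdict(list)
    for record in records:
        clusters[record[cluster_key]].append(record)
    rng = random.Random(seed)
    # Assumption: clusters themselves are selected at random (one-stage design).
    chosen = rng.sample(sorted(clusters), n_clusters)
    return chosen, [r for name in chosen for r in clusters[name]]

# Hypothetical records, for illustration only.
residents = [
    {"name": "A", "state": "California"},
    {"name": "B", "state": "Texas"},
    {"name": "C", "state": "Florida"},
    {"name": "D", "state": "Texas"},
    {"name": "E", "state": "California"},
    {"name": "F", "state": "Hawaii"},
]
states, sampled = cluster_sample(residents, "state", n_clusters=2, seed=7)
print(states, len(sampled))
```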
Stratified Sampling

Stratified random sampling is a method in which the researcher divides the
population into smaller groups that don't overlap but represent the entire
population. While sampling, the researcher organizes these groups and then
draws a sample from each group separately.

For example, a researcher looking to analyze the characteristics of people
belonging to different annual income divisions will create strata (groups)
according to the annual family income, e.g. less than Rs. 2,00,000,
Rs. 2,00,000 to Rs. 4,00,000, Rs. 4,00,000 to Rs. 6,00,000, Rs. 6,00,000 to
Rs. 8,00,000, etc. By doing this, the researcher can draw conclusions about the
characteristics of people belonging to different income groups. Marketers can
analyze which income groups to target and which ones to eliminate to create a
roadmap that would bear fruitful results.
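
The income example can be sketched in Python as follows. Several details are assumptions made for illustration: a boundary value such as Rs. 4,00,000 is placed in the higher stratum, a top stratum of "8,00,000 and above" stands in for the slide's "etc.", and the same number of people is drawn from each stratum. The function names and the income list are hypothetical.

```python
import random
from collections import defaultdict

def income_stratum(income):
    """Map an annual family income (Rs.) to a non-overlapping stratum label."""
    if income < 200_000:
        return "below 2,00,000"
    if income < 400_000:
        return "2,00,000-4,00,000"
    if income < 600_000:
        return "4,00,000-6,00,000"
    if income < 800_000:
        return "6,00,000-8,00,000"
    return "8,00,000 and above"   # assumed top stratum

def stratified_sample(incomes, per_stratum, seed=None):
    """Split incomes into strata, then draw a random sample from each one."""
    strata = defaultdict(list)
    for income in incomes:
        strata[income_stratum(income)].append(income)
    rng = random.Random(seed)
    return {
        name: rng.sample(members, min(per_stratum, len(members)))
        for name, members in strata.items()
    }

# Hypothetical incomes from Rs. 50,000 to Rs. 9,90,000 in steps of Rs. 10,000.
incomes = list(range(50_000, 1_000_000, 10_000))
sample = stratified_sample(incomes, per_stratum=3, seed=0)
for name, members in sample.items():
    print(name, members)
```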
Uses of probability sampling

• Reduce Sample Bias
• Diverse Population
• Create an Accurate Sample
Convenience Sampling

• When data collection depends on ease of access to the data, it is termed
convenience sampling. This non-probability sampling method is used when there
are time and cost limitations in collecting feedback. In situations where there
are resource limitations, such as the initial stages of research, convenience
sampling is used.
For example, startups and NGOs usually conduct convenience sampling at a mall
to distribute leaflets of upcoming events or promotion of a cause. They do that
by standing at the mall entrance and giving out pamphlets randomly.
Judgmental Sampling

• Judgemental or purposive samples are formed at the discretion of the
researcher. Researchers purely consider the purpose of the study, along with
the understanding of the target audience. For instance, suppose researchers
want to understand the thought process of people interested in studying for
their master's degree. The selection criterion will be: "Are you interested in
doing your masters in …?" and those who respond with a "No" are excluded from
the sample.
Accidental Sampling

• Accidental sampling (sometimes known as grab, convenience or opportunity
sampling) is a type of non-probability sampling which involves the sample being
drawn from that part of the population which is close to hand. That is, a
sample population is selected because it is readily available and convenient.
Quota Sampling

• In quota sampling, members are selected based on a pre-set standard. In this
case, as a sample is formed based on specific attributes, the created sample
will have the same qualities found in the total population. It is a rapid
method of collecting samples.
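
One way to picture a pre-set standard is a fixed quota per attribute value, filled in the order respondents become available. The sketch below assumes exactly that; the quotas, the respondent records, and the function name are hypothetical.

```python
def quota_sample(respondents, key, quotas):
    """Accept respondents in arrival order until each group's quota is full."""
    remaining = dict(quotas)          # copy so the caller's quotas are untouched
    sample = []
    for person in respondents:
        group = person[key]
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
    return sample

# Hypothetical respondents in the order they were approached.
respondents = [
    {"id": 1, "gender": "F"},
    {"id": 2, "gender": "M"},
    {"id": 3, "gender": "M"},
    {"id": 4, "gender": "F"},
    {"id": 5, "gender": "M"},
    {"id": 6, "gender": "F"},
]
picked = quota_sample(respondents, "gender", {"F": 2, "M": 2})
print([p["id"] for p in picked])
```

No random draw is involved, which is what places quota sampling among the non-probability methods.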
Snowball Sampling

Snowball sampling is a sampling method that researchers apply when the subjects
are difficult to trace. For example, it will be extremely challenging to survey
shelterless people or illegal immigrants. In such cases, using the snowball
theory, researchers can track a few categories to interview and derive results.
Researchers also implement this sampling method in situations where the topic
is highly sensitive and not openly discussed, for example, surveys to gather
information about HIV/AIDS. Not many victims will readily respond to the
questions. Still, researchers can contact people they might know or volunteers
associated with the cause to get in touch with the victims and collect
information.
Uses of non-probability sampling

• Create a hypothesis
• Exploratory research
• Budget and time constraints
Task

Write down the difference between probability sampling and non-probability
sampling methods.
Factors determining Sample Size

• Area of Research
• Availability of Funds
• Availability of Manpower
• Nature of Research
• Method of Sampling
• Time Frame
• Method of Data Collection
• Judgement of the Researcher
• Accuracy
Good Sampling

• Sample that provides correct and quality information
• Goal oriented
• Simple and practical
• Random selection and variability of data
• Suitability
Data Processing

Data processing is the method of collecting raw data and translating it into
usable information. It is usually performed in a step-by-step process by a team
of data scientists and data engineers in an organization. The raw data is
collected, filtered, sorted, processed, analyzed, stored and then presented in
a readable format.
Stages in Data Processing

• Editing
• Coding
• Classification
• Tabulation
• Graphic Presentation
Editing of data

Editing is the first step of data processing. Editing is the process of
examining the data collected through a questionnaire or any other method. It
starts after all the data have been collected, to check the data or reform them
into a useful form.
Coding of data

Coding is the process of categorizing data according to the research subject or
topic and the design of the research. In the coding process, the researcher
sets a code for a particular item, such as M for male and F for female, to
indicate gender in the questionnaire without writing it out in full. Similarly,
the researcher can use colors to highlight something, or numbers such as 1+ and
1-. This type of coding makes it easy to calculate or evaluate results in
tabulation.
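
The M/F example amounts to a lookup in a codebook. A minimal Python sketch, assuming free-text answers and an "NA" code for answers that are not in the codebook (both the fallback code and the function name are assumptions):

```python
# Codebook from the example: male -> M, female -> F.
GENDER_CODES = {"male": "M", "female": "F"}

def code_responses(responses, codebook, missing="NA"):
    """Replace each free-text answer with its short code.

    Answers are trimmed and lower-cased before lookup; anything not in the
    codebook gets the assumed placeholder code given by `missing`.
    """
    return [codebook.get(answer.strip().lower(), missing) for answer in responses]

raw_answers = ["Male", "female ", "Female", "other"]
print(code_responses(raw_answers, GENDER_CODES))
```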
Classification of Data

Classification or categorization is the process of grouping the statistical
data under various understandable homogeneous groups for the purpose of
convenient interpretation. A uniformity of attributes is the basic criterion
for classification, and the grouping of data is made according to similarity.
Classification becomes necessary when there is diversity in the data collected,
so that the data can be presented and analysed meaningfully. However, it is
meaningless in respect of homogeneous data. A good classification should have
the characteristics of clarity, homogeneity, equality of scale, purposefulness
and accuracy.
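
As one illustration of grouping by similarity with an equal scale, the sketch below sorts numeric values into class intervals of the same width. The interval width, the sample values, and the function name are assumptions made for the example.

```python
from collections import defaultdict

def classify(values, width):
    """Group numbers into equal-width class intervals [lower, lower + width)."""
    groups = defaultdict(list)
    for value in values:
        lower = (value // width) * width
        groups[(lower, lower + width)].append(value)
    return dict(sorted(groups.items()))

# Hypothetical ages of respondents, grouped into 10-year classes.
ages = [3, 12, 17, 25, 8]
for interval, members in classify(ages, 10).items():
    print(interval, members)
```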
Tabulation of data

Tabulation is the process of summarizing raw data and displaying it in compact
form for further analysis. Therefore, preparing tables is a very important
step. Researchers can tabulate by hand or digitally. The choice is made largely
based on the size and type of study, alternative costs, time pressures, and the
availability of computers and computer programmes. If the number of
questionnaires is small, and their length short, hand tabulation is quite
satisfactory.
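
For digital tabulation, a frequency table is often the first summary produced. A minimal sketch using only the Python standard library follows; the coded answers and the function name are hypothetical.

```python
from collections import Counter

def tabulate(codes):
    """Return (code, count, percentage) rows, sorted by code."""
    counts = Counter(codes)
    total = sum(counts.values())
    return [(code, n, round(100 * n / total, 1)) for code, n in sorted(counts.items())]

# Hypothetical coded answers from five questionnaires.
coded_answers = ["M", "F", "F", "M", "F"]
print(f"{'Code':<6}{'Count':>6}{'Percent':>9}")
for code, count, percent in tabulate(coded_answers):
    print(f"{code:<6}{count:>6}{percent:>9}")
```

For a handful of short questionnaires the same table can be tallied by hand; the code pays off as the number of responses grows.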
Graphical Representation

Diagrams are charts and graphs used to present data. They help attract the
reader's attention and present data more effectively. Creative presentation of
data is possible. Data diagrams are classified into:
• Pie Charts
• Bar Graphs
• Line Graphs
• Gantt Charts
• Histograms
Gantt Chart
