Data Collection Lecture

Data collection is the systematic process of gathering and evaluating information from various sources to address research questions and inform decision-making. It involves identifying data types, sources, and methods, and can be categorized into primary and secondary data collection techniques. The integrity of data collection is crucial for effective analysis and decision-making, with challenges such as data quality issues and ambiguity needing to be addressed.

Uploaded by

eddysmart65
Copyright: © All Rights Reserved

What is Data Collection?

Data collection is the process of collecting and evaluating information or
data from multiple sources to find answers to research problems, answer
questions, evaluate outcomes, and forecast trends and probabilities. It is an
essential phase in all types of research, analysis, and decision-making,
including that done in the social sciences, business, and healthcare.

During data collection, researchers must identify the data types, the
sources of data, and the methods being used. We will soon see that there
are many different

Before an analyst begins collecting data, they must first answer three
questions:

• What is the goal or purpose of this research?
• What kinds of data are they planning on gathering?
• What methods and procedures will be used to collect, store, and process
  the data into information?

Data can be divided into two types: qualitative and quantitative.

Qualitative data covers descriptions such as color or complexion, name, and
appearance, as well as descriptive categories of age or size (for example,
"young" or "large").

Quantitative data deals with numbers, such as statistics, poll numbers,
percentages, and averages.
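
As a rough illustration of this distinction, the sketch below summarizes a qualitative field by counting categories and a quantitative field by averaging. The survey records and the field names `eye_color` and `score` are hypothetical and are not taken from the lecture.

```python
from collections import Counter
from statistics import mean

# Hypothetical survey responses (illustrative only).
responses = [
    {"name": "Ada", "eye_color": "brown", "score": 72},
    {"name": "Ben", "eye_color": "green", "score": 65},
    {"name": "Chi", "eye_color": "brown", "score": 88},
]

def summarize_qualitative(records, field):
    """Qualitative data is described by categories, so count each one."""
    return Counter(r[field] for r in records)

def summarize_quantitative(records, field):
    """Quantitative data is numeric, so statistics such as the mean apply."""
    return mean(r[field] for r in records)

color_counts = summarize_qualitative(responses, "eye_color")
average_score = summarize_quantitative(responses, "score")
```

Counting suits the qualitative field because its values are labels, while the mean only makes sense for the numeric field.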

Why Do We Need Data Collection?

The concept of data collection isn't new, and it has helped shape the world
into what it is today. With the aid of technology, data collection has
become much easier, and individuals can access a wide range of data at any
time.

Different Methods of Data Collection

1. Primary Data Collection


The first technique of data collection is primary data collection, which
involves collecting original data directly from the source or through
direct interaction with respondents. This method allows researchers to
obtain firsthand information tailored to their research objectives. There
are various techniques for primary data collection, including:

• Surveys and Questionnaires: Researchers design structured
  questionnaires or surveys to collect data from individuals or groups.
  These can be conducted through face-to-face interviews, telephone
  calls, mail, or online platforms.

• Interviews: Interviews involve direct interaction between the
  researcher and the respondent. They can be conducted in person, over
  the phone, or through video conferencing. Interviews can be structured
  (with predefined questions), semi-structured (allowing flexibility), or
  unstructured (more conversational).

• Observations: Researchers observe and record behaviors, actions, or
  events in their natural setting. This method is useful for gathering
  data on human behavior and interactions without direct intervention.

• Experiments: Experimental studies involve manipulating variables to
  observe their impact on the outcome.

• Focus Groups: Focus groups bring together a small group of individuals
  who discuss specific topics in a moderated setting. This method helps
  in understanding the opinions, perceptions, and experiences shared by
  the participants.
2. Secondary Data Collection

The second technique of data collection is secondary data collection,
which involves using existing data that someone else collected, usually
for a purpose other than the current research. Researchers analyze and
interpret this data to extract relevant information. Secondary data can
be obtained from various sources, including:

• Published Sources: Researchers refer to books, academic journals,
  magazines, newspapers, government reports, and other published
  materials that contain relevant data.

• Online Databases: Numerous online databases provide access to a wide
  range of secondary data, such as research articles, statistical
  information, economic data, and social surveys.

• Government and Institutional Records: Government agencies, research
  institutions, and organizations often maintain databases or records
  that can be used for research purposes.

• Publicly Available Data: Data shared by individuals, organizations, or
  communities on public platforms, websites, or social media can be
  accessed and used for research.

• Past Research Studies: Previous research studies and their findings
  can serve as valuable secondary data sources. Researchers can review
  and analyze the data to gain insights or build upon existing knowledge.

DATA COLLECTION TOOLS

Now that we've explained the various techniques, let's narrow our focus
even further by looking at some specific tools.

• Word Association: The researcher gives the respondent a set of words
  and asks them what comes to mind when they hear each word.

• Sentence Completion: Researchers use sentence completion to understand
  the respondent's ideas. This tool involves giving an incomplete
  sentence and seeing how the interviewee finishes it.

• Role-Playing: Respondents are presented with an imaginary situation
  and asked how they would act or react if it were real.

• In-Person Surveys: The researcher asks questions in person.

• Online/Web Surveys: These surveys are easy to carry out, and they are
  conducted online.

• Mobile Surveys: These surveys take advantage of the increasing
  proliferation of mobile technology. Mobile collection surveys rely on
  mobile devices like tablets or smartphones to conduct surveys via SMS
  or mobile apps.

• Observation: By making direct observations, the researcher collects
  data quickly and easily, with little intrusion or third-party bias.
  However, this method is only effective in small-scale situations.

THE IMPORTANCE OF DATA COLLECTION

Accurate data collection is crucial to preserving the integrity of research,
regardless of the subject of study. Its benefits include:
1. Effective Decision Making
2. Assessing Performance
3. Continuous Improvement
4. Market Analysis
5. Understanding Client / Patient Behavior
6. Cost Reduction
THE STAGES OF DATA COLLECTION

The stages of data collection include:
1. Planning: Decide what data to collect, and set a deadline.
2. Gathering: Collect data using surveys, online tracking, or other methods.
3. Cleaning: Remove incorrect data and check for missing or inconsistent
   data.
4. Preparing: Organize the cleaned data so that it is accurate and
   consistent.
5. Analyzing: Use analytical tools to identify patterns, trends, and
   correlations in the data.
6. Storing: Save processed data for future use.
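
The stages after planning can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a prescribed implementation: the sample rows, the field names (`id`, `age`, `city`), and the rule that a negative age is "incorrect" are all hypothetical.

```python
import json
from statistics import mean

def gather():
    """Gathering: stand-in for a survey export (hypothetical data)."""
    return [
        {"id": 1, "age": "34", "city": " Lagos "},
        {"id": 2, "age": None, "city": "Abuja"},   # missing age
        {"id": 3, "age": "-5", "city": "Lagos"},   # impossible value
        {"id": 4, "age": "41", "city": "abuja"},
    ]

def clean(rows):
    """Cleaning: drop rows with missing or clearly incorrect ages."""
    kept = []
    for row in rows:
        if row["age"] is None:
            continue
        if int(row["age"]) < 0:
            continue
        kept.append(row)
    return kept

def prepare(rows):
    """Preparing: convert types and normalize text so rows are consistent."""
    return [
        {"id": r["id"], "age": int(r["age"]), "city": r["city"].strip().title()}
        for r in rows
    ]

def analyze(rows):
    """Analyzing: compute a simple summary statistic."""
    return {"count": len(rows), "mean_age": mean(r["age"] for r in rows)}

def store(rows):
    """Storing: serialize to JSON; a real project would write to a file or database."""
    return json.dumps(rows)

prepared = prepare(clean(gather()))
summary = analyze(prepared)
stored = store(prepared)
```

Keeping each stage as its own function makes it easier to see which step changed the data when a result looks wrong.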

METHODS OF MAINTAINING THE INTEGRITY OF DATA COLLECTION

First, what is data integrity?
Data integrity refers to the assurance that an organization's data is
accurate, complete, consistent, and reliable throughout its lifecycle.

There are two major ways of maintaining data integrity: quality assurance
and quality control.

Quality Assurance
Quality assurance in data collection is a systematic process of ensuring
that the data gathered is accurate, complete, reliable, and consistent. It
implements procedures to identify and address potential errors throughout
the data collection process, so that the information collected is of
sufficient quality for analysis and decision-making.

Quality Control
Quality control in data collection is the process of implementing methods
to identify, prevent, and correct errors in the data being gathered,
ensuring its accuracy, consistency, completeness, and reliability before
analysis. Essentially, it is a set of procedures to ensure that the quality
of the collected data is high enough for its intended purpose, and that
data gathering was carried out in accordance with the manual's defined
methods. Additionally, quality control determines the appropriate
solutions, or "actions," to fix flawed data gathering procedures and reduce
recurrences.
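
A quality control step of this kind can be sketched as a set of automated checks that flag flawed records for correction. The required fields, the plausible age range of 0 to 120, and the sample records below are assumptions made for illustration only.

```python
def check_record(record, required_fields, age_range=(0, 120)):
    """Return a list of quality problems found in one record."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in required_fields:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")
    # Accuracy: a numeric age must fall inside the plausible range.
    age = record.get("age")
    if isinstance(age, (int, float)) and not (age_range[0] <= age <= age_range[1]):
        problems.append("age out of range")
    return problems

def quality_control(records, required_fields):
    """Map each flawed record's id to its problems so it can be corrected."""
    report = {}
    for record in records:
        problems = check_record(record, required_fields)
        if problems:
            report[record.get("id")] = problems
    return report

# Hypothetical records: one valid, one inaccurate, one incomplete.
records = [
    {"id": 1, "age": 29, "email": "a@example.com"},
    {"id": 2, "age": 150, "email": "b@example.com"},
    {"id": 3, "age": 41, "email": ""},
]
report = quality_control(records, ["age", "email"])
```

The report lists only the records that need attention, which matches the idea of detecting flaws and then deciding on a corrective action.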

After Data Collection, What Happens?

Once you've gathered your data through various methods of data collection,
the following steps come next:

1. Process and Analyze Your Data: At this stage, you’ll use various
methods to explore your data more thoroughly. This can involve
statistical methods to uncover patterns or qualitative techniques to
understand the broader context. The goal is to turn raw data into
actionable insights that can guide decisions and strategies moving
forward.

2. Interpret and Report Your Results: After analyzing the data collected
through methods of data collection in research, the next step is to
interpret and present your findings. The format and detail depend on
your audience: researchers might require academic papers, while health
teams often rely on real-time feedback. What's key here is ensuring
that the data is communicated clearly, allowing everyone to make
informed decisions.

3. Safely Store and Handle Data: Once your data has been analyzed,
proper storage is essential. Cloud storage is a reliable option, offering
both security and accessibility. Regular backups are also important,
as is limiting access to ensure that only the right people are handling
sensitive information. This helps maintain the integrity and safety of
your data throughout the project.

COMMON CHALLENGES IN DATA COLLECTION

Several challenges commonly arise while collecting data. Let us explore a
few of them so that we can better understand and avoid them.

Data Quality Issues


The main threat to the broad and successful application of machine
learning is poor data quality. Data quality must be your top priority if you
want technologies like machine learning to work effectively.

Inconsistent Data
When working with various data sources, the same information is likely to
show discrepancies between sources. The differences could be in formats,
units, or occasionally spellings. Inconsistent data might also be
introduced during company mergers or relocations. Inconsistencies tend to
accumulate and reduce the value of data if they are not continually
resolved. Organizations that focus heavily on data consistency do so
because they only want reliable data to support their analytics.
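
As a sketch of how such discrepancies might be resolved, the example below brings two records describing the same measurement to one shared format. The alias table, the field names, and the choice of kilograms as the common unit are hypothetical; real projects would maintain their own reference tables.

```python
# Hypothetical lookup table mapping spelling variants to one canonical name.
COUNTRY_ALIASES = {
    "usa": "United States",
    "u.s.a.": "United States",
    "united states": "United States",
    "nigeria": "Nigeria",
}
LB_TO_KG = 0.45359237  # definition of the international pound in kilograms

def normalize(record):
    """Return the record with a canonical country name and weight in kg."""
    raw_country = record["country"].strip()
    country = COUNTRY_ALIASES.get(raw_country.lower(), raw_country)
    weight = record["weight"]
    if record["unit"] == "lb":
        weight = weight * LB_TO_KG
    return {"country": country, "weight_kg": round(weight, 2)}

# The same information as it might arrive from two different sources.
source_a = {"country": "USA", "weight": 70.0, "unit": "kg"}
source_b = {"country": "united states ", "weight": 154.32, "unit": "lb"}

normalized = [normalize(source_a), normalize(source_b)]
```

After normalization the two records compare as equal, which is what makes later steps such as merging or duplicate detection reliable.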

Data Downtime
Data is the driving force behind the decisions and operations of data-driven
businesses. However, there may be brief periods when their data is
unreliable or not ready for use. Customer complaints and poor analytical
outcomes are two common consequences of this data unavailability.

Ambiguous Data
Even with thorough oversight, some errors can still occur in massive
databases or data lakes. The issue becomes more overwhelming when data
streams in at high speed. Spelling mistakes can go unnoticed, formatting
problems can occur, and column headings might be misleading. This
ambiguous data can cause several problems for reporting and analytics.

Duplicate Data
Different data sources often overlap and contain duplicates of the same
records. For instance, duplicate contact information has a substantial
impact on customer experience. Marketing campaigns suffer if certain
prospects are ignored while others are engaged repeatedly. The likelihood
of biased analytical outcomes increases when duplicate data is present.
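
One simple way to remove duplicate contacts is sketched below. It assumes that two contacts are the same person when their email addresses match after trimming spaces and ignoring case; that matching rule and the sample contacts are hypothetical choices, not something the lecture specifies.

```python
def contact_key(contact):
    """Treat contacts as duplicates when their normalized emails match (an assumption)."""
    return contact["email"].strip().lower()

def deduplicate(contacts):
    """Keep the first occurrence of each contact, preserving the original order."""
    seen = set()
    unique = []
    for contact in contacts:
        key = contact_key(contact)
        if key not in seen:
            seen.add(key)
            unique.append(contact)
    return unique

# Hypothetical contact list merged from two overlapping sources.
contacts = [
    {"name": "Ada Obi", "email": "ada@example.com"},
    {"name": "A. Obi", "email": " ADA@example.com"},
    {"name": "Ben Eze", "email": "ben@example.com"},
]
unique_contacts = deduplicate(contacts)
```

With the duplicates removed, each prospect appears once, so a campaign would not contact the same person repeatedly.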

Abundance of Data
Having too much data can itself become a data quality problem. There is a
risk of getting lost in abundant data when searching for information
pertinent to your purpose of study. Data scientists, data analysts, and
business users are said to devote about 60% of their work to finding and
organizing the appropriate data. With increased data volume, other data
quality problems become more serious, especially when dealing with
streaming data and large files or databases.

Inaccurate Data
Data accuracy is crucial for highly regulated sectors like healthcare.
Recent experience, with the COVID-19 era as a typical example, has made it
more important than ever to improve data quality.

Data inaccuracies can be attributed to several things, including data
decay, human error, and data drift. Worldwide, data is estimated to decay
at a rate of about 3% per month, which is quite concerning. Data integrity
can be compromised while transferring between different systems, and
data quality might deteriorate with time.
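
To see what a decay rate of about 3% per month implies over a longer period, the short calculation below compounds it over twelve months. It assumes that each month 3% of the records that are still accurate become outdated; the lecture gives only the monthly rate, so the compounding model is an assumption.

```python
def fraction_still_accurate(monthly_decay_rate, months):
    """Fraction of records still accurate after compounding decay each month."""
    return (1 - monthly_decay_rate) ** months

# 3% monthly decay (figure from the text) applied over one year.
remaining_after_year = fraction_still_accurate(0.03, 12)
decayed_after_year = 1 - remaining_after_year
```

Under this assumption roughly 69% of the records are still accurate after a year, so about 31% have decayed, slightly less than the 36% that simply multiplying 3% by 12 would suggest.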

Hidden Data
One of the major constraints on data collection is hidden data. This occurs
when researchers withhold useful data from the general public for the sake
of confidentiality.

Deciding the Data to Collect


Determining what data to collect is one of the most important decisions in
data collection and should be among the first to be made. We must choose
the subjects the data will cover, the sources we will use to gather it, and
the information required. Our answers to these questions will depend on our
aims, or what we expect to achieve using the data. As an illustration, we
may choose to gather information on the categories of articles that website
visitors between the ages of 20 and 50 most frequently access. We can also
decide to compile data on the average age of all the clients who purchased
from our business over the previous month.
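
The second illustration can be sketched in a few lines. The purchase records, the field names, and the choice to count each client only once are hypothetical details added for the example.

```python
from datetime import date
from statistics import mean

# Hypothetical purchase log; client c1 bought twice in May.
purchases = [
    {"client": "c1", "age": 30, "date": date(2024, 5, 3)},
    {"client": "c2", "age": 44, "date": date(2024, 5, 20)},
    {"client": "c1", "age": 30, "date": date(2024, 5, 28)},
    {"client": "c3", "age": 52, "date": date(2024, 4, 11)},
]

def average_client_age(purchases, year, month):
    """Average age of distinct clients who purchased in the given month."""
    ages = {}
    for p in purchases:
        if p["date"].year == year and p["date"].month == month:
            ages[p["client"]] = p["age"]  # repeat buyers are counted once
    return mean(list(ages.values())) if ages else None

may_average = average_client_age(purchases, 2024, 5)
```

Deciding up front that the goal is the average age of clients, rather than of purchases, is what determines that repeat buyers are counted only once.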

Dealing with Big Data


Big data refers to massive data sets with more intricate and diversified
structures. These traits typically make the data harder to store and
analyze, and make it harder to extract results from it.

Recent technological advancements have increased the amount of data
produced by healthcare applications, the Internet, social networking sites,
sensor networks, and many
