COE107 – Lesson 1:
INTRODUCTION TO
DATA COLLECTION AND
SURVEYS
DISCLAIMER
Fair use of a copyrighted work is defined in Sec. 185 of RA 8293, which
states, “the fair use of a copyrighted work for criticism, comment, news
reporting, teaching including multiple copies for classroom use, scholarship,
research, and similar purposes is not an infringement of copyright.”
This PowerPoint Learning Material is prepared and compiled by Engr.
Raymart Mallari solely for students of Pamantasan ng Cabuyao (PnC) under the
College of Engineering enrolled in COE107 for EcE course for AY 2023-24.
INTENDED LEARNING OUTCOMES
After this lesson, the students will be able to:
• Differentiate between primary and secondary data sources
and assess their suitability for various scenarios.
• Design and implement effective data collection plans,
including surveys, for specific engineering problems.
• Analyze potential biases in data collection methods and
evaluate their impact on engineering data analysis.
INTRODUCTION
Primary vs Secondary Data
Primary Data Sources
• Primary data sources refer to data collected firsthand by the
researcher for a specific research purpose. This type of data
is original and has not been previously published.
Advantages
• Directly relevant to the research question.
• Up-to-date and specific.
• High level of control over the data collection process.
Disadvantages
• Time-consuming and costly.
• May require specialized equipment or expertise.
• Potential for researcher bias in data collection.
Secondary Data Sources
• Secondary data sources refer to data that has been
collected, processed, and published by someone else for a
different purpose. Researchers use this pre-existing data for
their own research.
Advantages
• Easily accessible and less expensive.
• Saves time since the data is already collected and processed.
• Can provide a broader context and background information.
Disadvantages
• May not be perfectly aligned with the specific research
question.
• Potentially outdated.
• Limited control over data quality and methodology used in
data collection.
Primary or Secondary Data?
A researcher conducts a lab experiment to determine the
effect of temperature on the conductivity of a specific
material.
Primary or Secondary Data?
An engineer designs a survey to gather feedback from users
about a new software application.
Primary or Secondary Data?
A business analyst uses data from a database like the World
Bank or UN data repository to study global economic
indicators.
Primary or Secondary Data?
A historian uses data from historical records and books to
study the economic trends of the 19th century.
Primary or Secondary Data?
A medical researcher reviews previously published journal
articles to gather data on the prevalence of a specific disease.
Primary or Secondary Data?
A sociologist conducts face-to-face interviews with
participants to study their attitudes towards climate change.
Primary or Secondary Data?
An economist uses data from government census reports to
analyze population growth trends.
Primary or Secondary Data?
A biologist observes the behavior of animals in their natural
habitat and records the findings.
Data Collection Plan
• An effective data collection plan is crucial for ensuring that
the data gathered is relevant, accurate, and reliable.
• The plan should be systematic and well-structured,
encompassing several key elements.
Objectives
• Clear, concise statements about what the data collection
aims to achieve.
Example:
If you're studying the impact of a new teaching method on
student performance, your objective might be to measure
changes in test scores before and after implementing the
method.
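The objective in this example can be made concrete with a small calculation. The sketch below is only an illustration: the scores are hypothetical, and it assumes the same five students are tested before and after the method is introduced.

```python
from statistics import mean

# Hypothetical scores for the same five students (not real data).
before = [62, 70, 55, 81, 68]
after = [68, 74, 61, 80, 75]

# Paired differences: one change value per student.
changes = [a - b for a, b in zip(after, before)]
mean_change = mean(changes)

print("Change per student:", changes)
print("Mean change:", mean_change)
```

Stating the objective as "measure the mean change in score per student" tells the planner exactly which data must be collected: two scores for every participant.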
Methods
• The techniques or procedures used to gather data.
Examples:
1. Surveys - distributing questionnaires to collect responses from a
large group.
2. Experiments - conducting controlled tests to observe specific
outcomes.
3. Interviews - gathering detailed information through direct
interaction with participants.
Instruments
• Tools or devices used to collect data.
Examples:
1. Questionnaires - a set of questions designed to gather
specific information from respondents.
2. Sensors - devices that measure physical properties, like
temperature or pressure.
3. Software - applications that collect and analyze digital data.
Sampling
• The process of selecting a subset of the population to represent
the entire population.
Examples:
1. Random Sampling - selecting participants randomly to avoid
bias.
2. Stratified Sampling - dividing the population into subgroups
and sampling from each subgroup.
3. Convenience Sampling - choosing participants who are easily
accessible.
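The three sampling approaches above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the population of 100 students, the "year" field, and the function names are all assumptions made for the example.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 students, 25 in each of four year levels.
population = [{"id": i, "year": 1 + i % 4} for i in range(100)]

def simple_random_sample(pop, n):
    """Every member has the same chance of being selected."""
    return random.sample(pop, n)

def stratified_sample(pop, key, n_per_group):
    """Split the population into subgroups, then sample randomly within each."""
    groups = {}
    for member in pop:
        groups.setdefault(member[key], []).append(member)
    sample = []
    for members in groups.values():
        sample.extend(random.sample(members, n_per_group))
    return sample

def convenience_sample(pop, n):
    """Take whoever is easiest to reach, here simply the first n on the list."""
    return pop[:n]

random_picks = simple_random_sample(population, 8)
stratified_picks = stratified_sample(population, "year", 2)
convenience_picks = convenience_sample(population, 8)
```

Note that the convenience sample always returns the same first eight students, which is exactly why it is prone to bias.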
Timeline
• A schedule that outlines when each step of the data
collection process will take place.
Example:
A timeline for a survey might include designing the survey
(Week 1), piloting the survey (Week 2), distributing the
survey (Weeks 3-4), and analyzing the data (Week 5).
Ethical Considerations
• Ensuring the data collection process respects the rights and
well-being of participants.
1. Respect for Human Dignity - All research activities must respect
all human beings’ inherent dignity and worth.
2. Informed Consent - Participants must be fully informed about
the study and agree to participate voluntarily.
3. Privacy and Confidentiality - Ensuring that participants' data is
kept private and secure.
4. Responsibility for Safety - Avoiding any actions that could harm
participants.
5. Integrity and Accountability - Ensuring that the research is
conducted in a transparent and responsible manner and
that all data collected is accurate and verifiable.
Data Management
• How the collected data will be stored, organized, and protected.
Examples:
1. Data Storage - Using secure servers or cloud storage to keep
data safe.
2. Data Organization - Labeling and categorizing data
systematically.
3. Data Protection - Implementing measures like encryption to
protect sensitive data.
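One related protection measure can be sketched with the Python standard library. The example below is pseudonymization, not encryption: it replaces a participant identifier with a keyed hash before the record is stored. The key, the identifier format, and the function name are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac

# Hypothetical secret key; in practice it is stored separately from the dataset.
SECRET_KEY = b"replace-with-a-real-secret"

def pseudonymize(participant_id: str) -> str:
    """Return a keyed SHA-256 digest that stands in for the real identifier."""
    return hmac.new(SECRET_KEY, participant_id.encode("utf-8"), hashlib.sha256).hexdigest()

# The stored record carries the pseudonym, not the original identifier.
record = {"participant": pseudonymize("2023-00123"), "score": 4}
```

The same identifier always maps to the same pseudonym, so responses from one participant can still be linked without storing who that participant is.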
Common Biases in Data Collection
• Bias in data collection refers to systematic errors that can
skew results and lead to incorrect conclusions.
• Recognizing and addressing these biases is essential for
conducting valid and reliable research.
Selection Bias
Occurs when the sample selected for the study is not
representative of the population.
Causes:
• Non-random sampling methods.
• Exclusion of certain groups.
Example:
Conducting a survey on customer satisfaction using only
responses from high-income neighborhoods.
Impact:
Results may not be generalizable to the entire population,
leading to skewed conclusions.
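The effect in this example can be illustrated numerically. The sketch below uses made-up satisfaction scores on a 1-5 scale and assumes, for simplicity, that the three neighborhood groups are the same size.

```python
from statistics import mean

# Hypothetical satisfaction scores (1-5) by neighborhood income level.
responses = {
    "high_income": [5, 4, 5, 4, 5],
    "middle_income": [3, 4, 3, 4, 3],
    "low_income": [2, 3, 2, 2, 3],
}

# Biased estimate: only high-income neighborhoods are surveyed.
biased_mean = mean(responses["high_income"])

# Fuller estimate: every neighborhood contributes.
all_scores = [score for scores in responses.values() for score in scores]
overall_mean = mean(all_scores)

print("High-income only:", biased_mean)
print("All neighborhoods:", round(overall_mean, 2))
```

With these invented numbers the high-income sample overstates satisfaction by more than one point on the scale, even though no individual response is wrong.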
Measurement Bias
Arises from inaccuracies in the data collection instruments or
procedures.
Causes:
• Faulty or uncalibrated instruments.
• Inconsistent data collection methods.
Example:
Using a poorly calibrated thermometer that consistently reads
temperatures 2 degrees higher than the actual temperature.
Impact:
Data collected may be systematically incorrect, leading to
invalid results.
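The thermometer example can be worked through in code to show why a systematic error does not average out. The temperatures below are hypothetical, and the sketch assumes the 2-degree offset is constant and known.

```python
from statistics import mean

OFFSET = 2.0  # the thermometer reads 2 degrees high, as in the example

# Hypothetical true temperatures and what the faulty thermometer reports.
true_temps = [20.0, 22.5, 25.0, 27.5]
readings = [t + OFFSET for t in true_temps]

# Taking more readings does not help: the mean is shifted by the same offset.
bias = mean(readings) - mean(true_temps)

# Calibration: subtract the known offset from every reading.
corrected = [r - OFFSET for r in readings]

print("Bias in the mean:", bias)
print("Corrected readings:", corrected)
```

In practice the offset is found by comparing the instrument against a trusted reference, which is what calibration means.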
Response Bias
Occurs when participants do not respond truthfully or
accurately.
Causes:
• Social desirability - Participants may give answers they think
are socially acceptable.
• Acquiescence - Participants may agree with all statements in
a survey.
Example:
In a survey about exercise habits, participants may overreport
their physical activity levels to appear healthier.
Impact:
Results may not accurately reflect the true behaviors or
attitudes of participants.
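Acquiescence can be screened for with a simple check on the collected responses. The sketch below is one possible heuristic, not a standard test: the respondent IDs and answers are hypothetical, and treating 4 and 5 on a 1-5 scale as agreement is an assumption.

```python
AGREE_THRESHOLD = 4  # on a 1-5 scale, 4 and 5 count as agreement (assumption)

# Hypothetical responses: respondent ID -> answers to five statements.
answers = {
    "R1": [5, 4, 5, 4, 5],
    "R2": [2, 4, 3, 5, 1],
    "R3": [4, 4, 4, 4, 4],
}

def agrees_with_everything(scores):
    """True when every answer is at or above the agreement threshold."""
    return all(score >= AGREE_THRESHOLD for score in scores)

# Respondents whose answers deserve a closer look.
flagged = [rid for rid, scores in answers.items() if agrees_with_everything(scores)]

print("Flagged for review:", flagged)
```

A flag is only a prompt for review, since a respondent may sincerely agree with every statement. Mixing in reverse-worded statements makes the check more informative.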
Observer Bias
Happens when the researcher's expectations influence their
observations or interpretations.
Causes:
• Preconceived notions or expectations about the study
outcome.
Example:
A researcher expecting a drug to be effective may interpret
ambiguous patient symptoms as signs of improvement.
Impact:
Data may be interpreted subjectively, leading to biased
conclusions.
Sampling Bias
Arises when some members of the population are less likely
to be included in the sample than others.
Causes:
• Convenience sampling - selecting a sample that is easy to
access.
• Volunteer bias - relying on participants who volunteer, who
may not be representative.
Example:
Conducting an online survey that excludes individuals without
internet access.
Impact:
Results may not accurately represent the entire population,
leading to biased findings.
Recall Bias
Occurs when participants do not remember past events
accurately.
Causes:
• Time lapse - the longer the time since the event, the less
accurate the recall.
Example:
Asking participants to recall their diet over the past year may
lead to inaccurate or incomplete responses.
Impact:
Data may be unreliable, affecting the validity of the analysis.
How do we mitigate bias?
1. Random Sampling
2. Blinding
3. Calibration and Standardization
4. Clear and Neutral Questioning
5. Training and Protocols