Data Collection

Key points about retrospective studies:
• They look back in time to analyze data that has already been collected.
• They are less expensive than collecting new data because they reuse existing data.
• The data may lack relevant information, because the analyst was not involved in designing the original data collection. Important variables could be missing.

Uploaded by

AG Gwan

GSF6011

Research
Methodology
Data Collection
Jane Labadin
• I’m just a mathematician
• Always a first time 
• Deliver based on experience

Caveat
original information which is
collected, stored, accessed, used or
disposed of during the course of the
research, and the final report of the
research findings

What is data?
• Can you do research without data?
• How can you resolve the research problem without
supporting data?
• How do you convince others, that your data are sufficient
to support the solution?
• Where do you go to find data?
• Can you have imaginary data in research?
• Can you have data simulation for research?

Motivation
collection of information (data) which
can be interpreted or analysed to
frame answers to your research
questions or increase knowledge of
your research topic.

Purpose of Collecting
Data
1. Sources – where will you get the information
2. Methods – how will you collect/gather the information

What does collecting data involve?
• From where or from whom will you get the information?
• Existing information – records, reports, program
documents, logs, journals
• People – respondents or informants
• Pictorial records and observations – video or photos,
observations or events, artwork, etc.

SOURCE of Information
• Testimonials
• Survey
• Physical evidence
• Interview
• Time series
• Observation
• Tests
• Group assessment
• Photographs, videos
• Expert or peer reviews
• Diaries, journals, logs
• Portfolio reviews
• Document review and analysis

METHODS of data
collection
• Obtained via
  – surveys of populations
  – repeated experimental procedures
  – …
• When recording, include detailed information (dates, place of collection, methods of measurement, units of measurement) to minimise confusion
• Recorded on printed datasheets, then stored in spreadsheet format.
• In some cases, data may be recorded by handheld computers or specialised data recorders, and later downloaded to more secure devices.
• Data recorders can often be set up to record data remotely, without the requirement that researchers be present. Such techniques are frequently used in meteorological research or in situations where it would be too hazardous for a researcher to be present (e.g. industrial chemistry applications, space research).
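One possible way to keep the recording details together (date, place, method, unit) is a small record type written out in a spreadsheet-friendly format. This is only a sketch: the `Observation` class, its field names and the sample readings are hypothetical and not part of the slides.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields
from datetime import date


@dataclass
class Observation:
    """One recorded measurement plus the context needed to interpret it later."""
    collected_on: date  # date of collection
    place: str          # place of collection
    method: str         # method of measurement
    quantity: str       # what was measured
    value: float
    unit: str           # unit of measurement


def to_csv(observations):
    """Serialise observations to CSV text that a spreadsheet can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Observation)])
    writer.writeheader()
    for obs in observations:
        row = asdict(obs)
        row["collected_on"] = obs.collected_on.isoformat()
        writer.writerow(row)
    return buffer.getvalue()


# Hypothetical readings, for illustration only.
records = [
    Observation(date(2024, 3, 1), "Lab A", "digital thermometer", "air temperature", 27.4, "degC"),
    Observation(date(2024, 3, 2), "Lab A", "digital thermometer", "air temperature", 28.1, "degC"),
]
csv_text = to_csv(records)
print(csv_text)
```

Storing the unit and method in every row is a deliberate choice here: each row stays interpretable even if it is separated from the rest of the datasheet.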

Quantitative Information
• May be in the form:
• Recordings of interviews
• Transcribed into written form
• Supporting notes
• Description of text/artefacts/system
• Interpretation of text/artefacts/system

Qualitative Information
• Source of Data

• Quantitative data are values on a numerical scale

• Qualitative data are observations that are not measured on a numerical scale (they fall into categories)

In a nutshell
Source of Data
• Quantitative (numerical)
  – Discrete
  – Continuous
• Qualitative (categorical)
  – Discrete


• Discrete Data
– Only certain values are possible (there are gaps between
the possible values)
• Continuous Data
– Theoretically, any value within an interval is possible
with a fine enough measuring device

Quantitative data
• Primary data: data observed and recorded or collected
directly from respondents

• Secondary data: data compiled both inside and outside
the organization for some purpose other than the current
investigation

Types of Data
• Primary: Data Collection
  – Observation
  – Survey
  – Experimentation
• Secondary: Data Compilation
  – Print or Electronic

Basic Business Statistics 10e, 2006 Prentice Hall

Types of Data
Type | Description | Examples
Ratio Data | Differences between measurements, true zero exists | Height, Age, Weekly Food Spending
Interval Data | Differences between measurements but no true zero | Temperature in Fahrenheit, Standardized exam score
Ordinal Data | Ordered Categories (rankings, order, or scaling) | Service quality rating, Standard & Poor’s bond rating, Student letter grades
Nominal Data | Categories (no ordering or direction) | Marital status, Type of car owned
Basic Business Statistics 10e, 2006 Prentice Hall
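The four levels of measurement matter in practice because they limit which summaries make sense. The sketch below encodes the usual textbook convention (mode for nominal, median added for ordinal, mean added for interval and ratio). That mapping is an assumption added here and is not stated on the slide, and the sample values are made up.

```python
from statistics import mean, median, mode

# Assumed convention: which summaries are meaningful at each level.
MEANINGFUL_SUMMARIES = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval": ["mode", "median", "mean"],
    "ratio": ["mode", "median", "mean"],
}
SUMMARY_FUNCTIONS = {"mode": mode, "median": median, "mean": mean}


def summarise(values, level):
    """Compute only the summaries that are meaningful for the given level."""
    return {name: SUMMARY_FUNCTIONS[name](values) for name in MEANINGFUL_SUMMARIES[level]}


# Hypothetical data, for illustration only.
car_types = ["sedan", "suv", "sedan", "truck"]  # nominal
service_ratings = [1, 3, 2, 3, 3]               # ordinal codes: 1 = poor, 2 = fair, 3 = good
heights_cm = [160.0, 172.5, 168.0, 172.5]       # ratio

nominal_summary = summarise(car_types, "nominal")
ordinal_summary = summarise(service_ratings, "ordinal")
ratio_summary = summarise(heights_cm, "ratio")
print(nominal_summary, ordinal_summary, ratio_summary)
```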

Categorical Data
• Information?

• What? Where? How?

Take a step back …


• Availability
• Training/expert assistance
• Pilot testing
• Interruption potential
• Protocol needs
• Reactivity
• Bias
• Reliability
• Validity

Things to consider
• How much information should you collect?

• Selecting a portion of subjects in order to learn something
about the entire population without having to measure the
whole group.

• Random? Size? Bias?

Sampling
“The natural state of most engineering
information contains significant
variability”
Ang and Tang, Probability Concepts in Engineering Planning and
Design

Research Methods in
Engineering & Computer
Science
Qualitative Research
• Concentrates on collecting and analyzing subjective data
• Usually the perceptions of the people involved
• Intention is to illuminate perceptions and, thus, gain
– greater insight (explain why) and
– knowledge (reproduce or recognize).

Quantitative Research
• Concentrates on what can be measured.
• Involves collecting and analyzing objective data
• Usually involves some form of math
– Statistical
– Calculus
– Discrete

Department of Computer Science Faculty of Science



Research Methods
What is this monkey doing?
Quantitative vs. Qualitative

Quantity vs Quality
• Quantitative
– We have a hypothesis that monkeys will put bananas to their
ears
– We gave bananas to monkeys
– If we say banana to ear == “Monkeycide”
– We counted xx instances of Monkeycide over yy trials
– Our hypothesis is accepted if xx > 0
• Qualitative
– We saw monkeys pick up bananas
– We observed the monkeys placing bananas to their ears
– From observation we have the concept: “Monkeycide”
– Monkeys Jenny, Irene and Blake exhibited Monkeycide

What is your research?
What is Experimental Design?
• An experimental design:
“Is the traditional approach to conducting quantitative research”
Source: Creswell, J.C. 2005

Branch of Quantitative
Research
An Example…

An engineer is tasked to develop a rubber compound for use in O-rings which
are to be used as seals in plasma etching tools. Resistance to acids and other
corrosive substances is important! Using a standard rubber compound, the
engineer produces 8 O-rings and tests their tensile strength after immersion in
nitric acid. She obtains the following results, plotted as a dot-diagram:

These measurements contain variability. What is the cause of the variability?

Now, the engineer is not happy with the location and scatter
(variability) of the data. The mean tensile strength is too low
and the variability too high. She decides to modify the
formulation with a Teflon additive, makes eight new O-rings,
and tests those.

•Will another set of O-rings give yet another set of results?
•Is a sample size of 8 adequate to give reliable results?
•What risks are associated with the assumption that the Teflon
additive leads to increased tensile strength?

Statistical Inference can help us answer these questions.
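As a rough illustration of how such a comparison might start, the sketch below summarises two samples of eight O-rings and computes Welch's t statistic for the difference in means. The tensile strengths are made-up numbers (the slide's data are not reproduced here), and Welch's statistic is just one common choice, not necessarily the method the lecture goes on to use.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical tensile strengths in psi, made up for illustration.
standard = [1030, 1035, 1020, 1049, 1028, 1026, 1019, 1010]
modified = [1037, 1047, 1066, 1048, 1059, 1073, 1070, 1040]


def summarise(sample):
    """Location (mean) and scatter (sample standard deviation)."""
    return mean(sample), stdev(sample)


def welch_t(a, b):
    """Welch's t statistic for the difference in means (b minus a)."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    return (mean(b) - mean(a)) / sqrt(var_a + var_b)


print("standard:", summarise(standard))
print("modified:", summarise(modified))
print("Welch t:", welch_t(standard, modified))
```

A large positive statistic suggests the difference in means is large relative to the sampling variability. Turning it into a risk statement (a p-value) needs the t distribution, which is outside this sketch.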

Data Collection Method 1: Retrospective Study

•Go back in time and analyze data that has already been collected over some period.
•Might be interested in using this data to construct an empirical model.

Pros: cost of collecting data minimized by taking advantage of existing data
Cons: Because the analyst was not involved in data collection, often
appropriate data is not collected.
For example: In the distillation column, operators don’t ever
change the reflux rate. Since reflux rate variability is small over
time, it will be difficult to tell if it affects the final concentration.
Other Cons: Missing data, reliability questionable, irrelevant data,
inappropriate use of data, not enough accompanying info

Data Collection Method 2: Observational Study

•Real time record of data, with no further interference.
•Engineer creates a data collection form for
operators to fill out at specified times – leaves
space for comments on anything unusual that
may have occurred.
•Engineer still stuck with the problem that the
system, as is, may not provide information
relevant to the question at hand.
Data Collection Method 3: Designed Experiments

•In a designed experiment, the engineer makes purposeful changes in
controllable variables (factors) of the system and observes the resulting
system output.
•He/she then makes a decision or an inference about which variables
are responsible for the changes in output performance observed.
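One common way to plan purposeful changes in the factors is a full factorial design, in which every combination of factor levels is run. The slide does not name a specific design, so treat this as an illustrative sketch. The factor names and levels below are hypothetical.

```python
from itertools import product

# Hypothetical factors and levels, chosen only for illustration.
factors = {
    "additive": ["none", "teflon"],
    "cure_temperature_C": [150, 170],
    "immersion_hours": [24, 48],
}


def full_factorial(factors):
    """List every combination of factor levels as one experimental run."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


runs = full_factorial(factors)
for number, run in enumerate(runs, start=1):
    print(number, run)
```

With three factors at two levels each this gives 2 × 2 × 2 = 8 runs. In practice the run order would usually be randomised before the experiment is carried out.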
• Linking probability and statistics…

[Diagram: probability reasons from the population to the sample; inferential statistics reasons from the sample back to the population]

•Populations can be physical or conceptual.
•Samples should be random, and not based on judgment or convenience
•Simple random sample: a sample of size n that has been selected from
a population in such a way that each possible sample of size n has
an equally likely chance of being selected.
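A simple random sample can be drawn with Python's standard library, as in the minimal sketch below. The population of unit IDs and the helper name `simple_random_sample` are hypothetical. `random.Random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every size-n subset is equally likely."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)


# Hypothetical population: units labelled 1 to 100.
population = list(range(1, 101))
sample = simple_random_sample(population, 8, seed=42)
print(sample)
```

Fixing the seed is useful for documenting exactly which units were selected. Omit it when a fresh draw is wanted.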
Statistics and Models
•Sometimes our models are very physical in nature (mechanistic
model)
Modeling the current flow in a thin copper wire
Current = voltage/resistance
I = E/R (Ohm’s Law)
•Since we know variability exists in measurements, a more realistic
model might be
I = E/R + ε
where ε accounts for the variability in the measurements.

•Sometimes we don’t have a physical law to explain the relationship
between variables. In this case we develop an empirical model.
Relationship between pull strength (y), wire length (x1) and die height (x2) for
determination of the strength of a wire bond in a semiconductor frame
y = f(x1, x2) + ε
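The mechanistic model with an error term can be simulated to see what the extra term does. In this sketch the voltage, resistance and noise level are hypothetical, and the error term is assumed to be normally distributed with mean zero. The slide only says that variability exists, so the normal distribution is an added assumption.

```python
import random
from statistics import mean


def measured_current(voltage, resistance, noise_sd, rng):
    """One noisy measurement: I = E/R + epsilon, with epsilon ~ Normal(0, noise_sd)."""
    return voltage / resistance + rng.gauss(0.0, noise_sd)


rng = random.Random(0)  # seeded so the simulation is reproducible
E, R = 12.0, 4.0        # hypothetical voltage (V) and resistance (ohm)
readings = [measured_current(E, R, noise_sd=0.05, rng=rng) for _ in range(1000)]

print("ideal current:", E / R)
print("mean of noisy readings:", mean(readings))
```

Individual readings scatter around E/R, while their average settles close to it, which is why repeated measurements help when the error term has mean zero.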
Observing processes over time
Plotting data over time allows an engineer to better understand how
phenomena affect a system’s stability over time.

Dot diagram has no time information.

Time series plot gives us another dimension to analyze.
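The point about lost time information can be checked numerically. In this sketch (hypothetical measurements), two series contain exactly the same values, so their dot diagrams are identical, yet only one of them drifts upward over time. The half-to-half shift used here is just a crude stand-in for looking at a time series plot.

```python
from statistics import mean

# Hypothetical measurements, listed in the order they were taken.
drifting = [10.1, 10.3, 10.2, 10.6, 10.8, 10.7, 11.0, 11.2]
# The same values, taken in a different order.
shuffled = [10.8, 10.1, 11.2, 10.3, 10.7, 10.2, 11.0, 10.6]


def dot_diagram_view(series):
    """A dot diagram only shows the values, so the time order is discarded."""
    return sorted(series)


def half_shift(series):
    """Mean of the second half minus mean of the first half, in time order."""
    mid = len(series) // 2
    return mean(series[mid:]) - mean(series[:mid])


print(dot_diagram_view(drifting) == dot_diagram_view(shuffled))
print(half_shift(drifting), half_shift(shuffled))
```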

Table 1 Questions asked by software engineering researchers (column 2) that can be answered by field study techniques

Technique | Used by researchers when their goal is to understand | Volume of data | Also used by software engineers for
Direct techniques
Brainstorming and focus groups | Ideas and general background about the process and product, general opinions (also useful to enhance participant rapport) | Small | Requirements gathering, project planning
Interviews and questionnaires | General information (including opinions) about process, product, personal knowledge etc. | Small to large | Requirements and evaluation
Conceptual modeling | Mental models of product or process | Small | Requirements
Work diaries | Time spent or frequency of certain tasks (rough approximation, over days or weeks) | Medium | Time sheets
Think-aloud sessions | Mental models, goals, rationale and patterns of activities | Medium to large | UI evaluation
Shadowing and observation | Time spent or frequency of tasks (intermittent over relatively short periods), patterns of activities, some goals and rationale | Small | Advanced approaches to use case or task analysis
Participant observation (joining the team) | Deep understanding, goals and rationale for actions, time spent or frequency over a long period | Medium to large |
Indirect techniques
Instrumenting systems | Software usage over a long period, for many participants | Large | Software usage analysis
Fly on the wall | Time spent intermittently in one location, patterns of activities (particularly collaboration) | Medium |
Independent techniques
Analysis of work databases | Long-term patterns relating to software evolution, faults etc. | Large | Metrics gathering
Analysis of tool use logs | Details of tool usage | Large |
Documentation analysis | Design and documentation practices, general understanding | Medium | Reverse engineering
Static and dynamic analysis | Design and programming practices, general understanding | Large | Program comprehension, metrics, testing, etc.
• Ask Question
• Do Background Research
• Construct Hypothesis
• Test with an Experiment
• Analyse Results, Draw Conclusion
• Hypothesis is true, or hypothesis is false or partially true
• If the hypothesis is false or partially true: think again and construct a new hypothesis
• Report results

//www.sciencebuddies.org/ml
Why experiment?
• Will I be using a formal, objective, systematic process
where data are utilized to test my research hypothesis?
• What are the variables I will consider in my study?
• Independent Variable(s)
• Dependent Variable(s)
• Extraneous Variable(s)
• What type of quantitative investigation will I pursue?

Ask yourself
Insist on precision only
when it is needed
Montgo
Personal Errors
Master of Inform
Faculty of Comp
• Ensure no distraction, e.g.:
• Check list
• Place the apparatus at a c

Acknowledgement
• General lighting should be
• Ventilation should be adeq
