L2 ResearchDesign - BRSM Lecture2

The document discusses measurement in behavioral sciences, detailing various types of scales (nominal, ordinal, interval, ratio) and their applications. It emphasizes the importance of reliability and validity in research, explaining different methods to assess these qualities. Additionally, it addresses potential confounds and biases in experimental design, highlighting the need for careful statistical analysis and ethical considerations in research practices.

Uploaded by

u80817578


Research Design

BRSM
Measurement in the behavioral sciences

Measurement
• Define the property you want to study
• Find a way to detect that property

Examples
• Aggression (operational definition? Measure?)
• Intelligence (operational definition? Measure?)
• Productivity in the office (operational definition? Measure?)
• Age (how do you measure this? Depends: developmental psych? Consumer research?)

Operational definition
• A working definition of what a researcher is measuring

In this task, “on target” is +/- 10% of the goal distance.

Golf Slides from Lassiter Speller (OSU).
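
The golf example shows how an operational definition turns a fuzzy property ("accuracy") into a rule that can be checked. A minimal sketch, assuming shot and goal distances are in the same unit; the function name `is_on_target` and the sample shots are hypothetical, not part of the original task.

```python
def is_on_target(shot_distance: float, goal_distance: float, tolerance: float = 0.10) -> bool:
    """Return True when the shot lands within +/- tolerance of the goal distance."""
    return abs(shot_distance - goal_distance) <= tolerance * goal_distance


# Hypothetical shots aimed at a goal distance of 100 units.
goal = 100.0
shots = [92.0, 111.0, 100.0, 89.5]
hits = [is_on_target(shot, goal) for shot in shots]
print(hits)
```

Changing `tolerance` changes what counts as "on target", which is exactly why the operational definition has to be reported alongside the result.
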

Terminology
Variable types: scales of measurement
• Nominal
• Ordinal
• Interval
• Ratio
Nominal scale
• Categorical
• e.g. eye color, sex
• Does not make sense to say one is greater than the other
• Also does not make sense to average them (e.g. average eye color?!)
Ordinal scale
• Slightly more structured than nominal: now you can order the values in some sensible way
• Natural ordering of the options
• Relative to some ground truth (e.g. scientific evidence), statement 1 > 2 > 3 > 4
• How do we group these responses for analysis?
• If it is an ordinal scale measurement, there are some sensible ways to do this and others that don't make sense
• Again, the average does not make sense: the average endorsed statement here is 1.97
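
One way to see why the average is suspect for ordinal data: any numeric coding that keeps the order is equally legitimate, yet the mean changes with the coding while the median stays on the same option. A small sketch with hypothetical responses and two hypothetical codings (neither comes from the survey on the slide).

```python
from statistics import mean, median

# Hypothetical responses: which statement (1-4) each participant endorsed.
responses = [1, 1, 2, 2, 2, 3, 4]

# Two codings that both preserve the order of the four statements
# but space them differently.
equal_steps = {1: 1, 2: 2, 3: 3, 4: 4}
uneven_steps = {1: 1, 2: 2, 3: 10, 4: 100}


def summarize(coding):
    """Return (mean, median) of the responses under a given numeric coding."""
    coded = [coding[r] for r in responses]
    return mean(coded), median(coded)


mean_equal, median_equal = summarize(equal_steps)
mean_uneven, median_uneven = summarize(uneven_steps)
print(mean_equal, median_equal)
print(mean_uneven, median_uneven)
```

Under both codings the median corresponds to statement 2, whereas the mean moves from roughly 2.1 to roughly 16.9, a number that only reflects the arbitrary spacing.
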
Interval scale
• Both interval and ratio scales: the numerical value can now be interpreted directly
• Interval: differences between numbers make sense, but there is no natural "zero" on this scale
• Addition and subtraction make sense, but not multiplication or division
• e.g. temperature. The difference between 20 and 17 deg Celsius = 3 degrees. The same as the difference between 30 and 27 degrees. There is no natural zero, just an arbitrary point (freezing point) chosen as a reference.
• Psych e.g. student attitudes as a function of time elapsed since joining date: the year of entry is an interval scale measurement
• Averages, medians, etc. make sense: the average temperature for the month
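
The temperature example can be checked numerically: equal differences stay equal when the unit changes, but ratios do not survive the change, because the zero point is arbitrary. A minimal sketch using the standard Celsius-to-Fahrenheit conversion; the 20 vs 10 degree comparison is an added illustration.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Standard conversion: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


# The two differences from the slide: 20 vs 17 and 30 vs 27 degrees Celsius.
pairs = [(20.0, 17.0), (30.0, 27.0)]
diffs_c = [a - b for a, b in pairs]
diffs_f = [celsius_to_fahrenheit(a) - celsius_to_fahrenheit(b) for a, b in pairs]

# A ratio, by contrast, depends on where zero happens to sit.
ratio_c = 20.0 / 10.0
ratio_f = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

print(diffs_c, diffs_f)
print(ratio_c, ratio_f)
```

Both differences are 3 degrees in Celsius and both are about 5.4 degrees in Fahrenheit, so "same difference" is a meaningful statement. "Twice as hot" is not: the ratio is 2.0 in Celsius but 1.36 in Fahrenheit.
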
Ratio scale
• Zero means zero
• Can divide
• e.g. reaction times (e.g. I'm twice as fast as you)
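
The four scales form a ladder: each one keeps the operations of the scale below it and adds more. The lookup below is only a compact restatement of the slides; the names `SCALE_ORDER`, `NEW_OPERATIONS`, and `meaningful_operations` are hypothetical, and the operation labels are informal.

```python
# Scales from least to most structured.
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]

# Operations that first become meaningful at each scale.
NEW_OPERATIONS = {
    "nominal": {"count categories"},
    "ordinal": {"order / rank"},
    "interval": {"add / subtract", "average"},
    "ratio": {"multiply / divide"},
}


def meaningful_operations(scale: str) -> set:
    """Collect every operation allowed at this scale and at all scales below it."""
    position = SCALE_ORDER.index(scale)
    allowed = set()
    for name in SCALE_ORDER[: position + 1]:
        allowed |= NEW_OPERATIONS[name]
    return allowed


for scale in SCALE_ORDER:
    print(scale, sorted(meaningful_operations(scale)))
```
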
Continuous vs discrete variables
Examples? What type of scale? Discrete or continuous?
• RTs?
• Year in which participants were born?
• Temperature?
• Your mode of transport to work?
• Place attained in a race?

Answers
• RTs: ratio scale and continuous
• Year in which participants were born: interval scale and discrete
• Temperature: interval scale and continuous
• Your mode of transport to work: nominal and discrete
• Place attained in a race: ordinal and discrete
Continuous vs discrete variables
Real-world variables may not always adhere to these classifications
• Likert scale
• Choose from the following options. You feel happy today:
1. Strongly disagree
2. Disagree
3. Neutral
4. Agree
5. Strongly agree
• What scale is this?
• Nominal? (hint: is there a natural ordering? If so, it can't be nominal)
• Ratio? (hint: is there a natural "zero"?)
• Ordinal or interval. Which one is it?
• Can we prove that everybody treats the difference between 1. and 2. the same as the difference between 4. and 5.?
• In practice, most people treat the Likert scale as an interval scale since many participants treat the entire scale seriously (but this is very much dependent on the task and context).
Is the measurement any good?
• Reliability: how repeatable?
• Validity: how accurate is it in relation to what you want to measure?
Reliability
Will we produce the same thing repeatedly?
E.g.
• Weighing machine: day 1 = 90 kg, day 2 = 110 kg. Unreliable!
• Psychology example:
• Want to measure depression
• Operational definition: number of times you hang out with family and friends (lower = depression)
• Measurement in July vs Nov
• Reliable?
Different ways to measure reliability
• Test-retest reliability
• Inter-rater reliability
• Parallel forms reliability
• Internal consistency reliability
Test-retest reliability
• Consistency over time
• Do we get the same results when we test at another time?
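
Test-retest reliability is commonly summarized as the correlation between scores from the two sessions. A minimal sketch, assuming the Pearson correlation is an acceptable index here; the scores are hypothetical and `pearson_r` is written out by hand to keep the example self-contained.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical scores for the same five participants tested twice.
session_1 = [12, 15, 9, 20, 17]
session_2 = [13, 14, 10, 19, 18]

retest_r = pearson_r(session_1, session_2)
print(round(retest_r, 3))
```

A value close to 1 means participants keep roughly the same relative standing across sessions. A correlation does not detect a uniform shift (everyone scoring 5 points higher), so it is worth looking at the session means as well.
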
Inter-rater reliability
• Consistency across people
• If someone else does the measurement, will we get the same result?
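
For categorical judgments, one common index of inter-rater agreement is Cohen's kappa, which corrects the raw agreement rate for the agreement expected by chance. The slides do not name a statistic, so this is an assumed choice; the ratings below are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Undefined (division by zero) when chance agreement is exactly 1.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical codings of eight video clips as aggressive ("agg") or calm.
rater_a = ["agg", "agg", "calm", "calm", "agg", "calm", "calm", "agg"]
rater_b = ["agg", "calm", "calm", "calm", "agg", "calm", "agg", "agg"]

kappa = cohens_kappa(rater_a, rater_b)
print(kappa)
```

Here the raters agree on 6 of 8 clips (0.75), chance agreement is 0.5, and kappa comes out at 0.5.
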
Parallel forms reliability
• Consistency across theoretically-equivalent measurements
• If I use a different weighing scale, do I get the same weight measurement?
Internal consistency reliability
• Consistency across different parts with the same function
• If questions on fluid intelligence spread across the IQ test all give similar estimates of my intelligence, the test has internal consistency
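
A widely used index of internal consistency is Cronbach's alpha, which compares the variance of the individual items with the variance of the total score. The slides do not name it, so treat this as one assumed option; the item scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha.

    item_scores: one list per item, each holding one score per participant.
    """
    k = len(item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical scores of five participants on three items meant to tap the same ability.
items = [
    [2, 4, 3, 5, 1],
    [3, 4, 3, 5, 2],
    [2, 5, 4, 4, 1],
]

alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

Because the three items rank the participants almost identically, alpha is high (about 0.94). Items that disagree with each other would pull it down.
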
Think about the evaluation components of this course
• How good is the internal consistency of the evaluations? (problem sets + quizzes + projects)
• How about within quizzes or any given component?

Experimental variables
• Independent variable (IV): the variable that is manipulated. Examples: amount of light, exposure to a loud noise, drug
• Dependent variable (DV): the variable that is measured to see if the independent variable had an effect. Examples: plant growth, change in heart rate, anxiety scores
Modern terminology
• We're using the predictors to make guesses about the outcome
Experimental Research
• The experimenter controls everything
• Manipulates the predictors and sees how the outcome changes
Practical issues
• We cannot possibly think of ALL the predictors that can influence the outcome
• How do we solve this issue?
Randomization
• Everyone is randomly split into an experimental group and a control group
• Experimental group: predictor (IV)
• Control group: absence of the predictor variable (or a different level of the variable)
• Then compare the outcomes in the two groups
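
Random assignment itself is easy to sketch: shuffle the participant list and cut it in half, so that no characteristic of a participant (known or unknown) can systematically decide their group. The participant IDs and the helper name `randomly_assign` are hypothetical.

```python
import random


def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into (experimental, control) halves."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical participant IDs P01..P20.
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = randomly_assign(participants, seed=42)

print("experimental:", experimental)
print("control:", control)
```

The fixed `seed` only makes the assignment reproducible for record keeping; it does not make it less random with respect to participant characteristics.
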
Discussion: Effect of playing violent video games on aggression
• Examine the database of players provided by a gaming company
• Get criminal records
• Test for a difference in the records between game players and non-players
• Any problems with this?

The role of confounds

• Perhaps the people playing violent video games as young children are also ones without proper parental support
• In the previous study, there was no consideration of this potential confound
The ideal experiment?
• Take a random sample from the population
• Randomly assign them into violent game-play vs peaceful game-play groups
• Monitor their lives for a few decades
• Get criminal records
• This is not exactly feasible though
So what do we do then?
• Use statistics!
• Incorporate confounds as covariates in your statistical models!
• I.e., we still want to understand how the outcome (aggression) varies as the predictor value is changed (violent game play), but now we will first take into account how much of the outcome is affected by the confounding variables
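
The logic of adjusting for a confound can be shown with a hand-built toy dataset. In the sketch below aggression is constructed to depend only on parental support, yet players still look more aggressive until the comparison is made within each level of support. Stratifying is used here as a simple stand-in for adding a covariate to a regression model; all numbers and names are hypothetical.

```python
from statistics import mean


def make_records():
    """Build (plays_violent_games, low_parental_support, aggression) records.

    Aggression depends ONLY on parental support (5.0 if low, 2.0 otherwise).
    Children with low support are simply more likely to be players.
    """
    records = []
    for low_support, n_players, n_non_players in [(True, 80, 20), (False, 20, 80)]:
        aggression = 5.0 if low_support else 2.0
        records += [(True, low_support, aggression)] * n_players
        records += [(False, low_support, aggression)] * n_non_players
    return records


def mean_difference(records):
    """Mean aggression of players minus mean aggression of non-players."""
    players = [agg for plays, _, agg in records if plays]
    non_players = [agg for plays, _, agg in records if not plays]
    return mean(players) - mean(non_players)


records = make_records()
naive_diff = mean_difference(records)
adjusted_diffs = {
    low_support: mean_difference([r for r in records if r[1] == low_support])
    for low_support in (True, False)
}

print(naive_diff)
print(adjusted_diffs)
```

The naive comparison shows a difference of about 1.8 points, while the difference within each support level is exactly 0: the apparent "effect" of game play was entirely carried by the confound. Adjustment of this kind only works for confounds that were actually measured.
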
Validity
• Confounds affect the validity of your study
• Many more factors affect the validity of a study
• Important to examine those before we delve into statistical methods
Validity

• Internal validity
• External validity
• Construct validity
• Face validity
• Ecological validity
Internal validity
• The ability to draw cause and effect inferences from the data
• The effect of COVID (Delta) on IQ.
• Recruit govt hospital patients. Compare with healthy controls who responded to online ads for your study.
• Internal validity?
External validity
• Generalizability of your findings
• Govt hospital COVID patients and their cognitive issues: generalizable to the rest of the population?
• A basic perception study with college undergrads?
• A study on attitudes towards psychotherapy based on CogSci students at IIITH?
Construct validity
• Are you really measuring what you want to be measuring?
• I want to understand the prevalence of depression in the student population
• I post a tweet and ask people with depression to like the tweet and others to retweet. The proportion of students who liked the tweet = my answer. How good is my construct validity?
Face validity
• Does your test "appear" to be doing the job it says it will do?
• Doesn't really matter for scientists.
• Can matter if you're trying to convince policy makers, for example. Then their perception of the test would matter.
Ecological validity
• Does the experiment closely mimic real-world scenarios?
• Related to external validity in that ecological validity is supposed to help us generalize the findings to real-world scenarios
• Though that is not guaranteed
• e.g. eye-witness studies in the lab lack ecological validity
• e.g. word memory experiments
• However, insights from word memory experiments may (and do) generalize to more ecologically valid settings
Threats to validity
• Confounds: related to both predictors and outcomes in some systematic way. A threat to internal validity. Why?
• Artifacts: something about the way you did the experiment that gave you the result. A threat to external validity (but probably also internal). Why?
History effects
• Something that happens during the study (or preceding it) that can influence the results
• Hospital stay, patient testing: 3rd day compared to 7th day. Electrode rearrangement surgery on day 5.
Maturational effects
• Something that changes naturally over time that can influence your results
• One big effect in psych lab experiments: waning attention and increasing fatigue over the course of the experiment. How do you know that primacy effects are not driven by such maturational effects?
(Repeated) testing effects
• Practice effects
• Familiarity with the test
• Better scores in session 2 compared to session 1
Selection bias
• Refers to anything that makes the groups being compared different in some potentially critical aspect
• Different proportions of males/females in the two groups in a study on aggression
• No more internal validity
Differential attrition
• If you do a long study, or a longitudinal study or any study that requires quite a bit of effort from the participants, this may be relevant.
• People drop out.
• The people dropping out are not random people.
Homogeneous vs heterogeneous attrition
• The rates of attrition can be the same across the groups you're comparing: homogeneous attrition
• But they can also be different: heterogeneous attrition
• Older people, for instance, may not carry on with a demanding task, and if you have a critical comparison between age groups, this can be a major issue
Non-response bias
• You work for a company
• You send out a survey to 1000 randomly selected email ids from your database
• Only 200 respond
• You say you chose the initial emails at random, so what's the problem?
• Again, the people who choose to respond are NOT random!
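
The arithmetic behind non-response bias can be made concrete. The sketch keeps the slide's numbers (1000 emails, 200 responses) and adds two assumptions that are purely hypothetical: half of the customers are satisfied, and satisfied customers respond at 30% versus 10% for dissatisfied ones.

```python
# Hypothetical make-up of the 1000 emailed customers and their response rates.
groups = [
    {"satisfied": 1, "emailed": 500, "response_rate": 0.30},
    {"satisfied": 0, "emailed": 500, "response_rate": 0.10},
]

emailed = sum(g["emailed"] for g in groups)

# True satisfaction rate among everyone who was emailed.
true_rate = sum(g["satisfied"] * g["emailed"] for g in groups) / emailed

# Number of responders per group, and the satisfaction rate the survey would report.
responders = [round(g["emailed"] * g["response_rate"]) for g in groups]
observed_rate = sum(g["satisfied"] * r for g, r in zip(groups, responders)) / sum(responders)

print(sum(responders), true_rate, observed_rate)
```

With these assumed rates, 200 people respond and 75% of them are satisfied, even though only 50% of the randomly selected customers are. Random selection of the emails does not protect against non-random response.
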
Regression to the mean
• When you select data based on an extreme value of some measure, a subsequent measurement will tend to "regress to the mean"
• Good examples in the textbook
• The children of tall parents will tend to be taller than average but shorter than their parents, while the children of short parents tend to be taller than their parents.
• Early studies suggested that people learn better from negative feedback than positive feedback
• But not really, it was also an artifact of regression to the mean (Kahneman & Tversky, 1973)
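
Regression to the mean falls out of any measurement that mixes a stable component with noise. In the simulation sketch below, each score is an assumed "true ability" plus independent noise; the top 10% on the first test are selected and then simply measured again, with no feedback or intervention of any kind. All parameters are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(0)
N = 10_000

# Each observed score = stable ability + independent measurement noise.
ability = [rng.gauss(0, 1) for _ in range(N)]
test_1 = [a + rng.gauss(0, 1) for a in ability]
test_2 = [a + rng.gauss(0, 1) for a in ability]

# Select the participants in roughly the top 10% on the first test.
cutoff = sorted(test_1)[N * 9 // 10]
selected = [i for i in range(N) if test_1[i] >= cutoff]

mean_first = mean(test_1[i] for i in selected)
mean_second = mean(test_2[i] for i in selected)
overall_second = mean(test_2)

print(round(mean_first, 2), round(mean_second, 2), round(overall_second, 2))
```

The selected group still scores above the overall average on the second test, but clearly lower than on the first. Anything done to them in between (praise, criticism, training) would appear to have "worked" or "backfired" even if it had no effect.
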
Experimenter Bias
Oskar Pfungst, a student at the Psychological Institute at the University of Berlin, showed through careful experiments that Clever Hans was responding to subtle, involuntary cues from von Osten. A classic early example of experimental design in behavioral psychology.
Demand and reactivity effects
• "Hawthorne" effect
• The influence of lighting on factory worker productivity
• But results were driven by the fact that workers did better when they thought they were being observed
Solution to both experimenter bias and reactivity effects
• Double blind studies

Placebo effects
• The expectation of a positive effect even from an inert drug will sometimes make people feel better
Fraud and deception
• This part is important because it is closely tied to statistical methods, including the inappropriate use of methods (sometimes intentionally, in order to deceive)
Data fabrication
• See https://retractionwatch.com/
• People make up data!
• Including some very high profile researchers
• There are data science sleuths who detect fraud using statistical methods
Study misdesigns
• Issues with study design that don't get reported
• Results may be artifacts of such misdesign
• e.g. surveys that are self-evident: sit back and let reactivity decide your results for you. If reviewers don't see the full surveys, this may not get detected
Data mining and post-hoc hypothesizing
• Data mining: I run 50 different variations of a model. Report only the one that worked.
• If you are honest, your statistical methods would "correct" for the 50 times you touched the data, because we want to know that the result obtained is a true one that is not likely to have come about by mere chance.
• Post-hoc hypothesizing: my initial hypothesis didn't work, but as part of the data mining effort above I found something else and reported that I had actually hypothesized it.
• Huge statistical issue when you do this, because many frequentist statistical methods depend on assumptions made about the null hypothesis
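
The cost of trying 50 model variations can be quantified under a simplifying assumption: the 50 tests are independent and each uses a significance threshold of 0.05 (neither figure beyond the "50" comes from the slides). The Bonferroni adjustment shown is one standard, conservative way to "correct" for the number of tests.

```python
ALPHA = 0.05      # assumed per-test significance threshold
N_MODELS = 50     # number of model variations tried

# Probability of at least one false positive when every null hypothesis is true
# and the tests are independent.
familywise_error = 1 - (1 - ALPHA) ** N_MODELS

# Bonferroni correction: the per-test threshold that keeps the familywise rate at ALPHA or below.
bonferroni_alpha = ALPHA / N_MODELS

print(round(familywise_error, 3), bonferroni_alpha)
```

With no real effect anywhere, there is roughly a 92% chance that at least one of the 50 variations "works" at the 0.05 level. Reporting only that one hides the other 49 attempts from the reader.
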
Publication Bias
• Journals as well as authors tend not to publish negative findings
• Distorts the literature, which comes to be dominated by small-N but "significant" studies
• Partly led to the "replication crisis" in psychology
• Also limits what you can learn from meta-analyses/reviews.
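
A small simulation suggests how a publication filter distorts the literature. The sketch assumes a true effect of exactly zero, many small studies (N = 20 each), and a crude rule that only studies with a test statistic above 1.96 get published. All of these settings are hypothetical, and the 1.96 cutoff is a rough normal approximation rather than an exact t-test.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(1)
TRUE_EFFECT = 0.0
N_PER_STUDY = 20
N_STUDIES = 2000


def run_study():
    """Simulate one small study; return (effect estimate, test statistic)."""
    sample = [rng.gauss(TRUE_EFFECT, 1) for _ in range(N_PER_STUDY)]
    estimate = mean(sample)
    statistic = estimate / (stdev(sample) / sqrt(N_PER_STUDY))
    return estimate, statistic


studies = [run_study() for _ in range(N_STUDIES)]

# Crude publication filter: only "significant" positive results are published.
published = [estimate for estimate, statistic in studies if statistic > 1.96]

all_mean = mean(estimate for estimate, _ in studies)
published_mean = mean(published)

print(len(published), round(all_mean, 3), round(published_mean, 3))
```

Averaged over every study, the estimated effect is close to the true value of zero. Averaged over the published subset only, it is clearly positive, which is what a meta-analysis of the published record would pick up.
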
Summary
1. Be aware of all the different ways in which the data from a study may have issues with reliability/validity
2. Be aware of potential confounds
3. Address the confounds using statistical methods
4. Be aware of dubious practices such as data mining and post-hoc hypothesizing
Advanced topics
Install R and RStudio
• http://cran.r-project.org/
• RStudio: http://www.RStudio.org/
