RiP Final Study
o Is written as ŷ = b₀ + b₁x
Predictions
Residuals
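A minimal sketch of how the intercept b₀, the slope b₁, the predictions ŷ and the residuals (y minus ŷ) fit together, assuming ordinary least squares. The function name and the data are hypothetical and only serve as an illustration:

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    # Slope: co-variation of x and y divided by the variation in x.
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: the line passes through the point (mean x, mean y).
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = fit_simple_regression(x, y)
predictions = [b0 + b1 * xi for xi in x]                 # y-hat for each x
residuals = [yi - yh for yi, yh in zip(y, predictions)]  # observed minus predicted
```

With a least-squares fit the residuals always sum to (approximately) zero: the line overshoots as much as it undershoots.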
Significance
𝐻0: 𝛽1 =0
𝐻𝐴: 𝛽1 ≠0
𝐻0: 𝜌2 =0
𝐻𝐴: 𝜌2 >0
- It is useful in 2 scenarios:
- E.g.
o Job satisfaction
- We write ŷ = b₀ + b₁x₁ + b₂x₂ + ⋯ + bₖxₖ
Coefficients
Significance
One at a time!
o H0: ρ² = 0
o HA: ρ² > 0
o Note: this is a test for significance of the entire model (with all
predictors)
- For example:
o H0: bPTSD = 0
o HA: bPTSD ≠ 0
Assumptions
Assumption: homoscedasticity
Assumption: Linearity
Assumption: no outliers
- Needed:
o n
o F and p
o R2
- Optional
o SE
- Create a table with values of Beta and p for each variable in the
rows
- Explain any abbreviations you may have used in the note as well
Causality
- The best way to meet all the conditions is by means of a
randomised experiment
Randomised experiment: a research design in which, thanks to randomisation,
groups can be assumed to be similar, one variable is manipulated (varied)
by the researcher, and the researcher measures the effect of this
manipulation on another variable (the outcome).
Covariance
- When do we speak of a relationship between note taking mode and
exam score?
o When we see a difference in exam scores between students
who use the two different note taking techniques
Temporal precedence
- An experiment allows the researcher to ensure that the cause
precedes the outcome
- By applying the manipulation before measuring the dependent
variable
Internal validity
- Alternative explanations for the relationship should be ruled out
- Is it the manipulated variable that explains the group difference or
is there an alternative explanation?
- Important role for:
o Design of the experiment
o Treatment of the participants
o Etc.
Research question
- An experimental research question can be identified by the following
elements:
- PICO:
o Population: the group of people the researcher wishes to
investigate
o Intervention: the experimental condition
o Comparison: the control group
o Outcome: the dependent variable
Research design
- The researcher chooses among whom and how the data will be
collected
- The researcher starts with a sample of participants
o Preferably a random sample
- Randomised experiment:
o Experimental group
o Control group
Random assignment
- Key of the true experiment
- Random procedure determines group assignment
o Treatment/experimental group
o Control group/placebo group
- Observed and unobserved factors are equally likely in both groups
- Transparent, reproducible
- Allows causal claims
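A minimal sketch of a transparent, reproducible random assignment, assuming a simple shuffle-and-split procedure. The function name, the seed value and the participant IDs are hypothetical:

```python
import random

def randomly_assign(participants, seed):
    """Split participants into an experimental and a control group at random."""
    rng = random.Random(seed)      # fixed seed makes the assignment reproducible
    shuffled = list(participants)  # copy, so the caller's data is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participant IDs 1..80, giving two groups of 40.
experimental, control = randomly_assign(range(1, 81), seed=2024)
```

Because chance alone decides who ends up in which group, observed and unobserved characteristics are equally likely to land in either group. Recording the seed lets someone else reproduce the exact assignment.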
Experimental designs
- Random assignment plays an integral role in experiment
- Experiments can be designed in different ways
Design 1: Posttest-Only-Design
- First, subjects are randomly assigned to the experimental group
and the control group
- After the treatment, the outcomes of the two groups are compared
Design 2: Pretest-Posttest-Design
- Aka the classical experiment
- A pre-test is added before the treatment
Design 6: Quasi-experiments
1. Learning effects
- Repeated-measures design
Learning effects: also known as order effects, practice effects, or testing effects
2. Design confounds
- A confounding variable is a second variable that happens to vary
systematically along with the intended independent variable
Confounding variable: a second variable that happens to vary
systematically along with the intended independent variable
- This variable is therefore an alternative explanation for the results
- Note: this does not apply to variables that vary randomly between
the groups/participants
3. Selection effects
- Were the groups comparable at the start of the experiment?
o With respect to the dependent variable?
o With respect to other variables (observed and unobserved)?
- If, for some reason, the groups turn out to be not comparable at the
start of the experiment, we speak of a selection effect.
Selection effects: the groups turn out not to be comparable at the start of
the experiment.
- Random assignment reduces selection effects to a minimum.
o The goal: making sure that the mean and variance in scores,
on all variables, measured and unmeasured, are similar for
both groups at the onset of the study
o Issues:
Sometimes impossible
Non-ethical
Infeasible
Sometimes possible, but things go wrong
4. Contamination
- Participants in the experimental group communicate with
participants in the control group
- Participants do not adhere to the treatment
- Influence from researcher(s)
8. Attrition
- When participants drop out during an experiment or study, this can
affect the results; this is called attrition
Attrition: when participants drop out during an experiment or study
- This is especially a problem if the people who drop out are
systematically different from the people who continue to participate
Independent t-test
Inferential statistics: experiments
- When researchers conduct experiments, they wish to test if there is
a difference
o Between groups or
o Between times of measurement
- The process they follow is similar to the process for correlational
research
o 1. Follow the theory-data cycle
o 2. Many researchers choose to follow the steps of NHST in the
data-analysis phase
Example
- Randomized experiment
o Group 1:
Control group, n1 = 40
o Group 2:
Experimental group, n2 = 40
1. Formulating hypothesis
- Research hypothesis:
On average, people who study words in large font score higher on the
recall test than people who study words in regular font size
- Null hypothesis:
On average, people who study words in large font and people who
study words in regular font score the same on the recall test
- Statistical hypothesis:
- Things to consider
- Units of measurement
- Spread in measurement
Test statistic t
- This standard error contains the group sizes (n1 and n2) and spread
in scores in both groups (SD1 and SD2)
- The units no longer play a role, since M1, M2, and the SE are all
measured in the same units
- We call this the test statistic t; values of t are always on the same
scale
o Values of t that are far from zero will be found less often
o Reject Ho
o Do not reject Ho
o Reject Ho if p<alpha
- Remember:
o P-value
o Calculated by software
- Interpretation of p-value
- Situation 1:
4. Decision about Ho
- Decision:
o Do not reject Ho
- Conclusion:
o The size of the font of the list of words has no significant effect
on the recall scores
- Decision:
o Reject Ho
- Conclusion:
o The size of the font of the list of words has a significant effect
on the recall scores. On average, people who study in a large
font score higher on the recall test than people who study
words in regular size font
A closer look at t
Test Statistic t
Formula:
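A minimal sketch of the computation, assuming the pooled-variance (equal variances) form of the standard error, so that t = (M1 - M2) / SE. The function name and the summary statistics are hypothetical; the p-value that belongs to t is left to statistical software:

```python
from math import sqrt

def independent_t(m1, sd1, n1, m2, sd2, n2):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two group variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Hypothetical summary statistics for two groups of 40 (not real results).
t, df = independent_t(m1=12.0, sd1=4.0, n1=40, m2=10.0, sd2=4.0, n2=40)
```

Dividing the difference in means by the SE removes the units of measurement, which is why values of t from different studies are on the same scale.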
Power
Choices
- When conducting a hypothesis
test, researchers must always
make a choice:
o Reject Ho
o Do not reject Ho
Making choices = sometimes
making mistakes
- When the null hypothesis is true,
and researchers choose to reject the null hypothesis, they make a
type I error
- When the null hypothesis is not true, and the researcher does not
reject the null hypothesis, they make a type II error
Type I error
- Researchers consider a type I error the worse of the two
mistakes
- In NHST the null hypothesis is protected by making the chance of
making a type I error small
Choice of alpha
- Choice of alpha depends on
o Research situation
o Severity of consequences
- Imagine two researchers evaluate the effectiveness of a treatment
for depression:
o Mindfulness training: relatively cheap training, no adverse side
effect
o Lithium: relatively expensive drug, serious risk of side effects
Type II error
- Chance of type II error = beta
- Value of beta is inversely related to the value of alpha
o If alpha high then beta low
o If alpha low then beta high
o NB: not by same amount
The inverse of a type II error
- Researchers are interested in the inverse of a type II error:
o Type II error: the researcher concludes – based on the sample
evidence – that there is no difference between two groups,
when – in reality – there is a difference in the population
o Inverse: the researcher concludes – based on the sample
evidence – that there is a difference between two groups,
when – in reality – there is a difference in the population
Chances
- The chance of a type II error was denoted by beta
- The chance of the inverse of the type II error, the chance of finding
the difference that actually exists is then 1-beta
- This chance is also referred to as the power of the test
Power = the chance of correctly rejecting Ho
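A minimal simulation sketch of both error rates, assuming normally distributed scores with SD = 1, two groups of 40, and a two-sided test at alpha = .05. The critical value 1.99 (for df = 78) and all names are assumptions made for this illustration:

```python
import random
from math import sqrt
from statistics import mean, variance

def simulate_rejection_rate(true_diff, n=40, sims=2000, t_crit=1.99, seed=1):
    """Estimate how often H0 is rejected when the true mean difference is true_diff.

    Scores come from normal populations with SD = 1, so true_diff is in SD units.
    t_crit = 1.99 is an assumed two-sided critical value for alpha = .05, df = 78,
    so it only matches the default n = 40 per group.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(sims):
        g1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        g2 = [rng.gauss(true_diff, 1.0) for _ in range(n)]
        pooled_var = (variance(g1) + variance(g2)) / 2  # valid for equal group sizes
        se = sqrt(pooled_var * 2 / n)
        t = (mean(g2) - mean(g1)) / se
        if abs(t) > t_crit:
            rejections += 1
    return rejections / sims

type_i_rate = simulate_rejection_rate(true_diff=0.0)  # H0 true: every rejection is a type I error
power = simulate_rejection_rate(true_diff=0.5)        # H0 false: every rejection is correct
```

When Ho is true, the rejection rate should land near alpha. When there is a real difference, the rejection rate estimates the power (1 - beta), and one minus that rate estimates beta.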
- Example:
- On experimental research:
o Participants are randomly assigned to the groups
o Independent variable is manipulated by the researcher
Comparing groups
- The t-test can be used to compare two groups
- Three scenarios:
o 1. Two groups of a randomised experiment
o 2. Two existing groups, where an independent variable is
manipulated
A kind of experiment without randomisation but with
manipulation is called a quasi-experiment
o 3. Two existing groups, where nothing is manipulated
Comparison of two groups without randomisation and
without manipulation is called a non-experiment
This is no longer experimental research but correlational
research
Scientific integrity
- European code of conduct
o Four principles which are the basis of integrity in research
Reliability
Honesty
Respect
Accountability
Idea/Theory
- Theory:
o Degradation of the (cleanliness of the) streets leads to
stereotyping and discrimination
- Research question:
o Do people exhibit more discriminatory behaviour on dirty
stations than on clean stations?
Experiment
- Same questionnaire
- At the dirty station, participants were more likely to sit further away
from the person on the bench if they had a different ethnic
background
- Fabrication:
o Make up data
o Deliberate violations
- Plagiarism:
o Deliberate violation
- = examples of honesty
- Data falsification:
HARKing
QRP
o P-hacking
o = HARKing
Solutions
- Retraction
o Form of self-correction afterwards
o Has drawbacks:
Reputational damage researcher
Reputational damage science in general
Often a long time between publication and retraction
- Post Publication Peer Review (PPPR)
o Online discussion platform about publications
Authors
Editors
Peers
- On Honesty and Accountability
- Pre-registration
o Mandatory submission of research protocol before execution
of actual research
Hypotheses
Methodology
Expectation
o Publication independent of outcome
- Replication
o As a regular part of the research cycle
Statistical validity
Construct validity
- How well were the variables manipulated/measured?
- What was the manipulated variable?
o The independent variable
o The dependent variable:
Score on 10 questions about the facts
Score on 10 questions about relationships between facts
External validity
- Non-random sample leads to lower external validity
- In experimental research, this is not always problematic
Statistical validity
- P<alpha, so Ho was rejected
o The researchers conclude that there is an effect of interim
revision of notes on learning achievement
o Difference between group 1 and group 2 is significant, but:
With a large sample, a small difference can already be
significant
A significant effect is not the same as a large effect
Important question to ask: how big is that
difference/effect
Effect size
- Difference between the two groups: M1-M2
- Intervention: revision
- Comparison: recopy
- Cohen’s d
o Measure of relevance
o AKA standardised mean difference
o Expresses the difference between the two means in number of
standard deviations:
o Guidelines for interpretation:
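A minimal sketch, assuming d is computed as (M1 - M2) divided by the pooled standard deviation and labelled with Cohen's conventional benchmarks (around 0.2 small, 0.5 medium, 0.8 large). The "negligible" label below 0.2, the function names and the numbers are choices made for this illustration:

```python
from math import sqrt

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference: (M1 - M2) divided by the pooled SD."""
    pooled_sd = sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))
    return (m1 - m2) / pooled_sd

def label_effect(d):
    """Rough verbal label using Cohen's conventional benchmarks (0.2, 0.5, 0.8)."""
    size = abs(d)
    if size < 0.2:
        return "negligible"
    if size < 0.5:
        return "small"
    if size < 0.8:
        return "medium"
    return "large"

# Hypothetical summary statistics, for illustration only.
d = cohens_d(m1=12.0, sd1=4.0, n1=40, m2=10.0, sd2=4.0, n2=40)
```

Unlike t, the value of d does not grow with the sample size, which is why it says something about how big the difference is rather than whether it is significant.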
Confidence interval
- Another way to describe the size of the difference between the two
groups is with a confidence interval (CI)
- What is a confidence interval?
o How can we use it?
- Recall:
o Every sample mean differs from the population mean
o In the same way: the difference between two sample means
differs from the difference in population means
- Is this point estimate informative?
o Point estimate gives false certainty
o Better option is interval of probable values based on sample
data
- An interval of probable values can be:
o We expect the true mean age of students at UCU to be
between 19.5 and 20.5: [19.5,20.5]
o The correlation between self esteem and extraversion is
estimated to fall between .10 and .25: [.10, .25]
o The difference between the mean scores using two different
teaching techniques is estimated to fall between 1.1 and 3.4
points: [1.1,3.4]
- The interval used in NHST is called a confidence interval
o Width of interval says something about the accuracy of the
estimation
o Researchers would like to see a narrow interval
o Width of the interval depends on:
Sample size
Spread/variation in scores in population
Chosen confidence level
Width Confidence interval
- Width of the interval depends on:
o Sample size:
Larger sample gives more information and therefore
more certainty
Larger sample gives a smaller standard error, hence a narrower
interval
o Spread in scores in population
Greater spread in scores in population gives greater
spread in scores in sample, so more uncertainty, hence a wider
interval
- Researcher often chooses level of confidence that matches the level
of significance
- Widely used significance level is alpha = .05, thus confidence level
of 95%
- A single confidence interval gives us an interval of plausible values
for the value in the population and we have confidence in the
process that is used
- With a single confidence interval, we do not know whether it is one of
the 95% of intervals that contain the population value or one of the 5%
that do not
o Chosen confidence level
Higher confidence level gives more certainty, but wider
interval
With a 99% CI, the interval is more likely to contain
the population value
With a 90% CI, we have less certainty
Higher confidence level, wider interval
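A minimal sketch of how sample size, spread and confidence level drive the width of the interval. It assumes equal group sizes and SDs and uses the normal (z) critical value as a large-sample approximation, where statistical software would normally use the t distribution. All names and numbers are hypothetical:

```python
from math import sqrt
from statistics import NormalDist

def ci_mean_difference(diff, sd, n_per_group, confidence=0.95):
    """Approximate CI for a difference between two means (equal n and SD per group).

    Uses the normal (z) critical value as a large-sample assumption.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sd * sqrt(2 / n_per_group)  # standard error of the difference
    return diff - z * se, diff + z * se

def width(interval):
    """Distance between the lower and upper bound of an interval."""
    low, high = interval
    return high - low

# Hypothetical numbers: observed difference of 2 points, SD of 4 in both groups.
w90 = width(ci_mean_difference(2.0, 4.0, 40, confidence=0.90))
w95 = width(ci_mean_difference(2.0, 4.0, 40, confidence=0.95))
w99 = width(ci_mean_difference(2.0, 4.0, 40, confidence=0.99))
w95_large_n = width(ci_mean_difference(2.0, 4.0, 160, confidence=0.95))
w95_more_spread = width(ci_mean_difference(2.0, 8.0, 40, confidence=0.95))
```

In this sketch, quadrupling the sample size halves the width, doubling the spread doubles it, and moving from 90% to 99% confidence widens it.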
Notes on relevance
- Relevance is assessed using a measure of effect size
o With a t-test, we use Cohen’s d
o With a regression analysis, the effect size is measured using
R² (R squared)
o With a Chi-squared test, we use a measure called Cramer’s V
o 𝐻0: 𝜇1 = 𝜇2
Effect size
- Measure of effect size for Chi-squared test is Cramer’s V
- Value between 0 and 1
- Measures the strength of dependency between the two nominal
variables
- “kind of” similar to a correlation
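A minimal sketch of the computation, assuming the usual definition V = sqrt(χ² / (n · (k - 1))), where n is the total count and k is the smaller of the number of rows and columns. The function name and the tables are hypothetical:

```python
from math import sqrt

def cramers_v(table):
    """Cramér's V for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Chi-squared: compare observed counts with the counts expected under independence.
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))  # smaller of the number of rows and columns
    return sqrt(chi2 / (n * (k - 1)))

# Hypothetical 2x2 tables of counts, for illustration only.
v_none = cramers_v([[10, 10], [10, 10]])   # no association between the variables
v_perfect = cramers_v([[20, 0], [0, 20]])  # perfect association
```

The two example tables mark the ends of the scale: 0 when the variables are independent, 1 when one variable fully determines the other.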
o 𝐻0: 𝜇1 = 𝜇2 or 𝐻0: 𝜇1 − 𝜇2 = 0
o 𝐷 = 𝑋after − 𝑋before
- We then get:
o 𝜇𝐷 = 𝜇after − 𝜇before
o 𝐻0: 𝜇𝐷 = 0
Effect size
- Also use Cohen’s d
- Formula is a little different:
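A minimal sketch for the paired design, assuming the test statistic is the mean difference score divided by its standard error, and assuming Cohen's d is the mean difference score divided by the SD of the difference scores (one common form; other variants exist). The function name and the scores are hypothetical:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t_and_d(before, after):
    """Return (t, d) for a paired-samples design based on difference scores."""
    diffs = [a - b for a, b in zip(after, before)]  # D = X_after - X_before
    m_d, sd_d = mean(diffs), stdev(diffs)
    t = m_d / (sd_d / sqrt(len(diffs)))  # test of H0: mu_D = 0
    d = m_d / sd_d                       # assumed form of Cohen's d for paired data
    return t, d

# Hypothetical before/after scores for four participants.
t, d = paired_t_and_d(before=[1, 2, 3, 4], after=[2, 4, 5, 5])
```

Working with the difference scores turns the two related measurements into a single sample, so the test reduces to checking whether the mean of D differs from zero.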
H0: σ1² = σ2²
H1: σ1² ≠ σ2²
o Solution:
Do that in JASP
Bayesian testing
Inferential statistics: NHST steps
1. Formulate a hypothesis
2. Choose test statistic and compute its value
3. Calculate the p-value
4. Make a decision about Ho
5. State the conclusion
- The p-value measures:
o Given the null hypothesis is true, what is the chance of
observing the data we observed?
- In order to use NHST, researchers must make many assumptions
- E.g.
o Distribution population is normal
o Variances in two populations are equal
o Null hypothesis is true
- Some researchers prefer not to make so many assumptions
Hypothesis testing
- An alternative way to test hypotheses is called
o Bayesian Testing
- What is the idea behind Bayesian testing?
o In Bayesian testing, we calculate:
Given the data we observed, what is the chance the null
hypothesis is true?
o Compared to NHST:
Given the null hypothesis is true, what is the chance of
observing the data we observed? (This is what the p-value
measures)
Bayesian testing
- In Bayesian testing, we do not report a p-value
- In Bayesian testing we report what is called the Bayes Factor
o It measures how much more likely the observed data are under
one hypothesis compared to the other (e.g. BF10: the alternative
relative to the null)
o The Bayes Factor (BF) measures this using ratio (fraction)
o In Bayesian statistics we look at the relative support for one
hypothesis over the other:
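A minimal toy sketch of the ratio idea, assuming BF10 = P(data | H1) / P(data | H0). It uses a simple yes/no (binomial) outcome with H0: proportion = 0.5 and an H1 under which every proportion is equally likely beforehand (a uniform prior). This example and its names are hypothetical, and it is not the default analysis JASP runs for a t-test:

```python
from math import comb

def bayes_factor_10(successes, n):
    """BF10 for binomial data: H1 (uniform prior on the proportion) versus H0 (proportion = 0.5)."""
    # Probability of the data under H0: proportion fixed at 0.5.
    p_data_h0 = comb(n, successes) * 0.5 ** n
    # Probability of the data under H1, averaged over a uniform prior on the
    # proportion; for the binomial this average equals 1 / (n + 1).
    p_data_h1 = 1 / (n + 1)
    return p_data_h1 / p_data_h0

# Hypothetical outcomes of 10 yes/no trials.
bf10_balanced = bayes_factor_10(5, 10)   # below 1: the data favour H0
bf10_extreme = bayes_factor_10(10, 10)   # far above 1: the data favour H1
```

A BF10 above 1 means the data support the alternative over the null, a BF10 below 1 means the reverse, and BF01 is simply 1 / BF10.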
Interval estimation
- In NHST
o Interval estimate to give the reader an idea of the size of the
effect:
Confidence interval
o In Bayesian testing:
Credible interval
Correlation
- Bayesian statistics can also be used for other analysis techniques
- E.g. correlation
- Using Pearson’s r and BF instead of Pearson’s r and p-value
Categorical data
- Instead of χ² you get BF10 (10 is standard in JASP), independent
multinomial
Replication
Studies
Reproducibility Project
- Researchers from all over the world collaborated to replicate 100
empirical studies from three top psychology journals, such as:
o Psychological Science
o Journal of Personality and Social Psychology
o Journal of Experimental Psychology
o Could they reproduce the results of the original studies?
- Significance
o In almost all original studies the null hypothesis was rejected
o Only in 1/3 of the replication studies was the null hypothesis
rejected
- Effect size
o The effect sizes (like Cohen’s d) were only half as large in the
replication studies as in the original studies
- Non-profit technology organisation with a mission to “increase the
openness, integrity, and reproducibility of scientific research”
- Replication studies
o Three types:
Direct replication
Advantages:
o Easy to compare
Disadvantages:
o Problems with internal validity in original
research will still be present
Conceptual replication
Advantages:
o Ability to improve design
o Increase internal validity
Disadvantages:
o Not as easy to compare
Replication-plus-extension
Advantages:
o Possibility to examine additional research
questions
Disadvantages:
o Not as easy to compare