2802 Key Points
Parametric Tests (T-Test and ANOVA) – require Interval (Rank & Distance, e.g. Celsius) or Ratio (True Zero, e.g. Kelvin) data; make strong assumptions about the population
Non-Parametric – used for Nominal (Labels & Categories) or Ordinal (Rank, e.g. Likert items, podium finishes) data, or when data deviate from normality
Variance
o Measures how far a set of (random) numbers is spread out from its average value.
o So, the larger the variance, the larger the spread in a data set.
o Variance emphasizes outliers more than standard deviation does.
o Because of the squaring operation, variance is not in the same measuring unit as its original data set anymore. SD returns to the original units.
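A quick Python sketch (made-up numbers, numpy assumed) showing that variance is in squared units while SD comes back to the original units:

```python
import numpy as np

scores = np.array([2, 4, 4, 4, 5, 5, 7, 9], dtype=float)  # made-up data
variance = scores.var()   # population formula: mean squared deviation -> 4.0 (squared units)
sd = scores.std()         # square root of the variance -> 2.0 (original units)
print(variance, sd)
```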
The experimenter exercises control by manipulating the independent variable and
holding all other factors constant to determine whether the dependent variable changes
in response.
o Independent variable (IV): A variable manipulated by the experimenter to observe
its effect on a dependent variable.
o Dependent variable (DV): The factor of interest being measured in the experiment
- Environmental manipulation: Alters the participants’ physical or social context for each
level of the independent variable.
- Instructional manipulation: Provides different directions for each level of the independent
variable.
- Stimulus manipulation: Uses different stimuli for each level of the independent variable.
- Invasive manipulation: Uses the administration of drugs or surgery to create physical
changes within the participant for each level of the independent variable
- Experimenters use a manipulation check to test whether the manipulation of the independent
variable was effective and elicited the expected differences between conditions.
Between Subjects Design
Assumptions: Interval/Ratio data, random sampling, independence of cases, normality
- Dependent T-test: compares two means from one sample tested under two conditions; the null hypothesis is that the mean difference = 0; df = n-1. Subjects go through both A and B, so it requires fewer participants, but the same participants must be used in each treatment, which can create carryover effects. (A minimal code sketch follows this list.)
- The between-subjects design is conceptually simpler, avoids order/carryover effects, and
minimizes the time and effort of each participant. The within-subjects design is more efficient for
the researcher and controls extraneous participant variables.
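A minimal scipy sketch (made-up scores) contrasting the dependent (paired) t-test with the independent-samples t-test on the same numbers:

```python
import numpy as np
from scipy import stats

cond_a = np.array([12, 15, 11, 14, 13, 16])   # same six participants under condition A...
cond_b = np.array([14, 17, 13, 15, 15, 18])   # ...measured again under condition B

t_paired, p_paired = stats.ttest_rel(cond_a, cond_b)   # null: mean difference = 0, df = n - 1
t_indep, p_indep = stats.ttest_ind(cond_a, cond_b)     # treats the scores as two separate groups
print(p_paired, p_indep)   # the paired test is usually more sensitive here
```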
Latin Squares Counterbalancing
- Each condition occurs once in each column and once in each row
- The number of possible orders will always equal the number of experimental conditions
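A minimal sketch (hypothetical condition labels) of a cyclic Latin square, where each condition appears once per row and once per column; note that a *balanced* Latin square needs a different construction:

```python
# Build a cyclic Latin square: row r is the condition list rotated by r positions.
conditions = ["A", "B", "C", "D"]   # hypothetical conditions
n = len(conditions)
square = [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]
for order in square:
    print(order)   # four orders for four conditions, as noted above
```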
Matched Group Design
- In the top example, lower variance between groups and higher variance within groups
- In the bottom example, higher variance between groups and lower variance within groups
One-Way ANOVA
- An experiment in which only one variable is manipulated
- Must have at least two levels (though with only two we typically use a t-test). An independent variable is a factor; the levels of the IV are the conditions/treatments
- Usually used when there is 1 IV with 3+ levels
- Different Types
o Randomized groups
o Matched-subjects
o Within Subjects/Repeated measures
T-test can only compare 2 groups, so with 3+ groups, we use ANOVA
Technically, we could run a bunch of t-tests if we wanted…what issue do we run into with this?
- How can we determine whether or not there are any significant differences between these means? We could run 10 t-tests, comparing each mean to every other mean in the group. The more t-tests we run, the higher the chance of an error: each test carries its own alpha, so the familywise probability of at least one false positive grows with the number of tests.
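A quick calculation of how the familywise Type I error rate grows with 10 tests at alpha = .05:

```python
# Probability of at least one Type I error across 10 independent tests at alpha = .05.
alpha, n_tests = 0.05, 10
familywise = 1 - (1 - alpha) ** n_tests
print(familywise)   # about 0.40, far above the nominal .05
```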
Two-way ANOVA is for 2 IVs
The Bonferroni Adjustment
- Divide the desired alpha level by the number of tests you plan to conduct
- So in the previous example, we would divide an alpha of 0.05 by 10, making a more
conservative test using alpha = .005
- This reduces the probability of making a Type I error, BUT can increase risk of making a
Type II error, by reducing statistical power. This is an acceptable tradeoff in some conditions.
- Because of the risk of increasing Type II error, researchers typically use the Bonferroni
adjustment when they plan to conduct only a few statistical tests. If a large number of
comparison between means is planned, it makes more sense to use an ANOVA. ANOVA will
compare all means simultaneously to determine if any differences are present…this allows
us to hold alpha at 0.05 (or whichever alpha is chosen)
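A minimal sketch, assuming statsmodels is available, of applying the Bonferroni adjustment to a set of hypothetical p-values from 10 tests:

```python
from statsmodels.stats.multitest import multipletests

p_values = [0.001, 0.004, 0.012, 0.03, 0.04, 0.20, 0.35, 0.50, 0.70, 0.90]  # made-up p-values
reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="bonferroni")
print(reject)       # only tests whose raw p < .005 (.05 / 10) survive the correction
print(p_adjusted)   # each raw p multiplied by the number of tests (capped at 1.0)
```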
Type 1 Error:
o Alpha
o Probability of accepting HA when H0 is true
o Ie: Deciding there is an effect when there really isn’t
Type 2 Error:
o Beta
o Probability of accepting H0 when HA is true.
o Ie: Deciding there is not an effect when there actually is.
o A Type 1 Error is much worse to commit than a Type 2 Error, because a Type 2 Error
is ignoring an effect, which is an easier fix, whereas Type 1 is finding a false positive.
- Power:
o 1-Beta
o Probability of finding an effect, given that effect exists (Inverse of a Type 2 Error)
o If your test is less sensitive, your test will not recognize the effect
Sensitivity measures the ability of a test to correctly identify true positives,
particularly in diagnostic testing. Power gauges the likelihood that a
statistical test will detect a true effect when one exists, mainly in hypothesis
testing and experimental design
If the variance between experimental conditions is markedly greater than the variance within the
conditions, it suggests the independent variable is causing the difference
- F-test: Ratio of the variance among conditions (between-groups variance) to the variance
within conditions (within-groups, or error, variance)
- ***Because all data points in a single group have been treated with the same IV, it is
impossible for them to contribute to systematic variance…within group variance is therefore
treated as error.***
We expect lower WG variance, and higher BG variance. If the proportion of BG variance is high enough,
results are significant, and we can reject the null hypothesis
F = MST / MSE, where MST = SS treatment / df treatment and MSE = SS error / df error (n is sample size)
The numerator of the F-statistic (MST) must be large relative to its denominator (MSE) in order for it to reach significance
Values near 1 indicate that variation between treatment and error is approximately equal.
This is a ratio of treatment variance to error variance (Signal to Noise)
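A minimal scipy sketch (made-up groups): the one-way ANOVA F is the ratio of between-group to within-group variance:

```python
import numpy as np
from scipy import stats

g1 = np.array([4.0, 5.0, 6.0, 5.0])   # made-up scores for three conditions
g2 = np.array([7.0, 8.0, 6.0, 7.0])
g3 = np.array([9.0, 10.0, 9.0, 8.0])

F, p = stats.f_oneway(g1, g2, g3)
print(F, p)   # large F => between-group variance dominates the within-group (error) variance
```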
The F-Distribution
- The F-test seeks to find a difference amongst the means (note that it does not specify exactly which means differ significantly)
- Regardless, the F-test will always be a one-tailed, upper bound test.
- Knowing the distribution of F when the null hypothesis is true allows us to find the p value.
Raw effects (or unstandardized effects): Straightforward measures of effect size, such as the
difference in the means for two samples.
o E.g., Two groups of participants have mean IQ scores of 101 and 113, respectively.
The raw effect of the difference of means would be 12.
- Standardized effects: Adjust the raw effect based on the amount of variability in the data.
- Cohen’s d: A common standardized effect size, defined as the raw effect divided by its standard deviation. Indicator of the practical value of results (a short computation sketch follows this list)
o E.g., If we continue with the IQ example and find that the standard deviation for our
data values is 15, then Cohen’s d would be the raw effect of 12 divided by the
standard deviation of 15 to give a standardized effect size of 0.8.
- Standardized effects have two main advantages over raw effects:
o 1) Can readily be compared across studies even when the specific measures used are
on different scales.
E.g., Cohen’s d computed on data using a measurement scale ranging from 0
to 20 can be meaningfully compared to Cohen’s d computed on data from a 0
to 100 scale.
o 2) Can be interpreted with respect to the standard deviation.
E.g. Cohen’s D of 0.8 tells us that the two groups differ by just under 1 SD
- Correlation-like effects: Measure the association between two variables.
o Ex: Pearson’s r, R2
- Correlation-like measures such as eta squared and omega squared are appropriate for analysis
of variance designs. They are all standardized
- If we square a correlation coefficient (R2) , we obtain the proportion of the total variance in
one set of scores that is systematic variance related to another set of scores.
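A quick sketch of the IQ example above: Cohen’s d is the raw effect divided by the standard deviation:

```python
# IQ example from the notes: group means of 113 and 101, SD of 15.
mean_a, mean_b = 113, 101
sd = 15
raw_effect = mean_a - mean_b      # 12 IQ points
cohens_d = raw_effect / sd        # 0.8 -> the groups differ by just under 1 SD
print(raw_effect, cohens_d)
```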
Eta-Squared
- Effect size commonly reported with ANOVA
o Sums of squares treatment/Sums of Squares Total (or just look at your output!)
o Value always between 0-1
o Larger values = higher proportion of variance attributed to IV
- .01: Small effect size
- .06: Medium effect size
- .14 or higher: Large effect size
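A quick sketch with assumed sums of squares: eta-squared is SS treatment over SS total:

```python
# Hypothetical sums of squares for illustration only.
ss_treatment = 30.0
ss_error = 170.0
ss_total = ss_treatment + ss_error
eta_squared = ss_treatment / ss_total
print(eta_squared)   # 0.15 -> a large effect by the benchmarks above
```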
Confidence Intervals
- Confidence interval: A range of values around the effect size obtained in your sample that is
likely to contain the true population effect size with a given level of plausibility or
confidence.
o A 95% confidence interval is commonly used in behavioural research
If we were able to identify the true population value of our measure of interest (e.g., a difference of two group means), there is a 95% chance that our confidence interval would contain that true value.
- The size of a confidence interval depends on both the variability in your data and your sample
size.
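A minimal scipy sketch (made-up scores) of a 95% confidence interval for a mean, built from the sample mean and its standard error:

```python
import numpy as np
from scipy import stats

x = np.array([5.1, 4.8, 6.0, 5.5, 5.9, 4.7, 5.3, 5.6])  # hypothetical scores
mean = x.mean()
sem = stats.sem(x)   # standard error of the mean (SD / sqrt(n))
ci_low, ci_high = stats.t.interval(0.95, df=len(x) - 1, loc=mean, scale=sem)
print(mean, (ci_low, ci_high))   # wider with more variable data, narrower with larger n
```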
Basic Factorial Designs: The 2 x 2
- The most basic factorial design is the 2 x 2.
- Each number refers to a factor, or independent variable.
o 2 x 2 = 2 IV, 2 levels each
o 2 x 2 x 2 = 3 IV, 2 levels each
o 2 x 2 x 3 = 3 IV, two with 2 levels and one with 3 levels
- The number of conditions in a factorial design – or all possible combinations of the
independent variables – can be computed by multiplying the numbers of levels of your
different factors.
o A 2 x 2 design has four possible conditions.
o A 2 x 3 design has six possible conditions.
o A 2 x 2 x 2 design has eight possible conditions
- Each condition – referred to as a cell – represents a unique combination of the levels of the
independent variables.
- In a 2x2 design, there are 2 possible main effects and one interaction, so 3 tests.
- In a 2x2x2 design, there are 3 possible main effects, three 2-way interactions, and one 3-way interaction, so 7 tests.
- Spreading Interactions
o IV B has an effect on level 1 of IV A, but not at level 2.
o IV B has a stronger effect on level 1 of IV A than level 2
o One line represents one half of the table (one set of red and blue)
- Cross-Over Interactions
o IV B has an effect on both levels of IV A, but in different directions. (Positive on
level 1, negative on level 2).
An interaction simply informs us that the effects of at least one independent variable depend on the level
of another independent variable. Whenever an interaction is detected, researchers need to conduct
additional analyses to determine where that interaction is coming from.
Monitor the IV at each level of the other IV, looking for effects.
It is only necessary to look for simple effects when an interaction is present.
You look for simple effects for each condition, a 2x2 design would have 4 potential simple effects
(Figure: main effect of weather; main effect of task.)
Law of Large Numbers
- As we observe more results, the average gets closer to our theoretical mean
o In a normal distribution, the standard deviation is meaningful.
o 68% of values fall within 1 SD of the mean
o 95% of values fall within 2 SD
o 99% fall within 3SD
- Sampling distribution of the mean: The pattern of mean values obtained when drawing
many random samples of a given size from a population and computing the mean for each
sample.
- An important property of sampling distributions is that as sample size increases, the
variability (variance, standard deviation) of the sampling distribution decreases.
- Standard error: The standard deviation of a sampling distribution. This value is calculated
by dividing the standard deviation by the square root of the sample size.
o As a result, the larger the sample size, the greater the precision in our estimates and
the smaller our p value will be.
- The standard deviation (SD) measures the amount of variability, or dispersion, from the
individual data values to the mean, while the standard error of the mean (SEM) measures
how far the sample mean (average) of the data is likely to be from the true population mean.
- Central limit theorem: A theorem that says with a large sample size, the sampling
distribution of the mean will be normal or nearly normal in shape.
o Even with populations having dramatically non-normal distributions, the sampling
distribution of the mean will be increasingly normal in shape as sample sizes
increase.
o This allows us to make use of the many attractive properties of the normal
distribution
- Statistical assumptions are made based on our understanding of how this works:
o Normality
o Homogeneity of variance
Ex: Levene’s test, Mauchly’s test
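A minimal simulation sketch (numpy assumed): even draws from a skewed population give a roughly normal sampling distribution of the mean, and its spread (the standard error) shrinks as SD/sqrt(n):

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.exponential(scale=2.0, size=100_000)   # clearly non-normal population

for n in (5, 30, 200):
    sample_means = [rng.choice(population, size=n).mean() for _ in range(2000)]
    # empirical SD of the sample means vs. the theoretical standard error
    print(n, np.std(sample_means), population.std() / np.sqrt(n))
```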
Criticisms of Hypothesis Testing
- Three of the common issues surrounding the use of NHST are:
o 1) The overreliance on p as an indicator of effect size or importance.
o 2) The arbitrary nature of a reject/fail-to-reject decision based on p.
o 3) The overemphasis on α and type I errors, leading to underpowered research
studies.
- to achieve a power of 0.8 in the presence of a large effect, 25 participants are required for
each group
- for a small effect, 393 participants are required for each group!
- To find a small effect, with high power, a large sample size is required. To find a large effect,
with high power, a small sample size is required.
- Small effect + High power = Large sample size.
- Medium effect + High power = Medium sample size
- Large effect + High power = Small sample size.
Effect size is the magnitude of the effect, power is your ability to recognize it.
The more conditions you add, the less ability to capture an effect in each condition. Power must be
distributed. Because of this, within-subjects design maximizes power.
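A minimal sketch, assuming statsmodels is available, reproducing the sample-size figures above for an independent-samples t-test at alpha = .05 and power = .80:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for d in (0.8, 0.2):   # Cohen's d for a large vs. a small effect
    n = analysis.solve_power(effect_size=d, alpha=0.05, power=0.80, alternative="two-sided")
    print(d, round(n))  # roughly 25-26 per group for d = 0.8; roughly 393-394 for d = 0.2
```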
- Between Groups:
o Two groups = independent t-test
o 3+ groups = One-way ANOVA
- Within Groups:
o Two sets of observations = dependent/paired t-test (could use anova)
o 3+ sets of observations = single factor repeated measures
o Can also be Randomized Blocks
- Factorial Designs:
o factorial designs can be within subjects, or mixed (both between and within subjects
factors)
- One-way ANOVA is not appropriate for within-subjects designs in which the means being
compared come from the same participants tested under different conditions or at different
times.
- The main difference is that measuring the dependent variable multiple times for each
participant allows for a more refined measure of MSE
- In a between-subjects design, these stable individual differences would simply add to the
variability within the groups and increase the value of MSE (which would, in turn, decrease
the value of F). In a within-subjects design, however, these stable individual differences can
be measured and subtracted from the value of MSE. This lower value of MSE means a higher
value of F and a more sensitive test.
Repeated Measures
- sometimes also called “within-subjects” or “within-participants”
- the same participant is measured on the same dependent variable multiple times (more than 2)
- (if only 2 measurements just use a paired samples t-test)
- e.g. the same participant is measured on their mood (1) before and (2) after a treatment and
then (3) again after a week
- effects of placebo vs treatment A vs treatment B on blood pressure can be studied in the same
participants, each participant can serve as their own control
- behaviour of subjects can be studied over multiple time points
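A minimal sketch, assuming statsmodels is available, of a repeated-measures ANOVA for the mood example above (made-up scores):

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Same four participants measured before treatment, after treatment, and one week later.
data = pd.DataFrame({
    "subject": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "time":    ["pre", "post", "week"] * 4,
    "mood":    [4, 6, 5, 3, 5, 5, 5, 7, 6, 4, 6, 6],   # made-up mood ratings
})

res = AnovaRM(data, depvar="mood", subject="subject", within=["time"]).fit()
print(res)   # F-test on the within-subjects factor 'time'
```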
Assumptions
- Independent random sampling
- Normality
- Circularity of the covariance matrix (Sphericity)
- Null hypothesis
One Way ANOVA
- Since the error term is reduced, the denominator of the F ratio is reduced, increasing F.
- Because big F values usually let us reject the idea that differences in our means are due to
chance, the repeated-measures ANOVA becomes a more sensitive test of the differences
(its F-values are usually larger).
- The repeated measures ANOVA uses different degrees of freedom for the error term, and
these are typically a smaller number of degrees of freedom. So, the F-distributions for the
repeated measures and between-subjects designs are actually different F-distributions,
because they have different degrees of freedom.
Repeated Measures:
o the same individuals are measured multiple times, as one group
o Basic within subjects design
Pros/Cons of Repeated Measures
- Advantages
o Each subject serves as their own control
o Increased Power due to partitioning of error
o More efficient
- Disadvantages
o Memory/fatigue effects
o Carry Over Effects
Ex: alcohol accumulated in blood stream
o Order Effects
0, 2, 4, 6 oz of alcohol vs. 6, 4, 2, 0 oz
Assumptions
- Independent Random Sampling
- Normality
- Homogeneity of Variance
o Between groups:
homogeneity of variance
Equivalence of covariance matrices (Box’s test)
o Within groups:
assumption of circularity
Use Greenhouse Geisser Correction if there are more than 2 levels!
- Null hypotheses
o Two sets: interaction and main effects
Ex: A (between subjects main effect), B (within subjects main effect), AB (interaction)
o Error term for between subject is distinct from error term used within subjects
o Between subjects x within subjects interaction is considered a within subjects effect
So A is a between subjects effect, but B and AB are within subjects
- Example A shows a between subjects design. Each subject only uses 1 brand.
o Uses 40 sample size, while B uses 10.
- In example B, each golfer is assigned to a block, but still experiences all 4 brands.
o If you want to stay strictly within subjects, you will use a randomized block design.
Split Plots
- 2 or more factors (IVs)
o At least one is independent/between groups
o At least one is repeated measures/within groups
- We’ll stick with the simplest option:
o 2 factor split-plot design (1 between, 1 within)
- Again…NOT the same as 2 one-way ANOVAs
o Looking for interaction between the 2 factors
o Main effects of secondary interest and must be interpreted in light of the interaction
o Post Hoc on main effects or simple main effects. (The differences in one variable at
every level of the other)
Example
- A personality researcher thought that women are more worried than men about what people
think about them. To test this, he looked at males and females nose picking behaviour as a
function of how many people were potentially watching. To do this experiment, they used a
doctor’s office waiting room. They left each subject (who was actually going to see the
doctor) in the waiting room for 40 minutes. While the subject waited, other people came and
left. The experimenter engineered it so that, out of the 40 minutes there were 0,1,2 or 3 other
people in the room for 10 minutes each. Through a two-way mirror, an independent judge
recorded the number of times that the subject inserted a finger into either of their nostrils.
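A minimal sketch of the split-plot example above, assuming the pingouin package is available: sex is the between-subjects factor, number of observers the within-subjects factor, and nose-pick frequency the DV (data are made up):

```python
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(1)
rows = []
for subj in range(1, 9):                       # 8 hypothetical participants
    sex = "M" if subj <= 4 else "F"
    for observers in (0, 1, 2, 3):             # within-subjects levels
        picks = rng.normal(4 - observers - (1 if sex == "F" else 0), 1.0)  # made-up scores
        rows.append({"subject": subj, "sex": sex, "observers": observers, "picks": picks})
df = pd.DataFrame(rows)

aov = pg.mixed_anova(data=df, dv="picks", within="observers",
                     subject="subject", between="sex")
print(aov)   # rows for the between effect, the within effect, and their interaction
```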
Greenhouse-Geisser
- Makes test more conservative
- Lowers DFs (does not change F)
o Lowering DFs makes associated p-value higher (ie: closer to 1.0), because you are
getting p-value from F-distribution based on lower DFs
- Report Greenhouse-Geisser for interaction and main effect of within-subjects factor if within
subjects factor has more than 2 levels
o If 2 levels, no need for correction (no risk of inflation)
o Greenhouse Geisser and “Sphericity Assumed” will be identical if only two levels.
- Generic design: the individuals are exposed to different levels of the independent variable over time. This is called a reversal design, following an A-B-A format.
o The change from one condition to the next does not usually occur after a fixed
amount of time or number of observations. Instead, it depends on the participant’s
behavior. Specifically, the researcher waits until the participant’s behavior in one
condition becomes fairly consistent from observation to observation before changing
conditions. This is sometimes referred to as the steady state strategy. This is under
the notion that when the participant’s behaviour becomes steady in one condition, it
will be easier to recognize a change in another condition.
o The effect of an independent variable is easier to detect when the “noise” in the data
is minimized.
Reversal Design
- In a basic reversal design, a baseline for the dependent variable is established before
treatment introduction, serving as a control condition (phase A). Once steady state responding
is reached, phase B begins with treatment introduction. The researcher waits for the
dependent variable to stabilize to assess changes. The design can include treatment
reintroduction (ABAB) or further baseline phases. In such designs, the levels of A may differ
upon reintroduction due to residual excitement, emphasizing the need for B to reach stability
before returning to A. Reversal increases internal validity by demonstrating that changes in
the dependent variable coincide with treatment introduction and removal, suggesting causal
relationships and minimizing the influence of extraneous variables.
- In a multiple-treatment reversal design, a baseline phase is followed by separate phases in
which different treatments are introduced. (ABCACBA). The participant could then be
returned to a baseline phase before reintroducing each treatment—perhaps in the reverse
order as a way of controlling for carryover effects.
- In an alternating treatments design, two or more treatments are alternated relatively quickly
on a regular schedule. (ABCBC)
Correlation
A measure of the strength of the relationship between two variables (behaviours, beliefs, etc.). Variables are “things that can change”
- Asks the question: “Do people with high (or low) scores on X also tend to have high (or low)
scores on Y?”
- (Note that while we were using categorical IVs with ANOVA and t-tests, correlation uses continuous variables)
- Correlations establish an association, not causation
o So…why not skip this and just do experimental research?
- May be starting point for future research
- Topic may be unethical or impractical to manipulate
o Amount of smoking and work productivity
Categorical variables represent distinct categories or groups and can only
take on a limited number of values. Examples include gender (male/female),
color (red/blue/green), and type of car (sedan/SUV/truck).
Continuous variables can take on any value within a range and are often
measured on a scale. Examples include height, weight, temperature, and age.
They can have infinite possible values within a given range.
Non-equivalent Groups Design
- A nonequivalent groups design is a between-subjects design in which participants have not been randomly assigned to conditions.
o In the posttest only nonequivalent groups design, participants in one group are
exposed to a treatment, a nonequivalent group is not exposed to the treatment, and
then the two groups are compared.
o In the pretest-posttest nonequivalent groups design there is a treatment group that
is given a pretest, receives a treatment, and then is given a posttest. But at the same
time there is a nonequivalent control group that is given a pretest, does not receive
the treatment, and then is given a posttest.
Qualitative Research
- collects large amounts of data from a small number of participants, often exploratory
- analyses the data nonstatistically.
Complex Correlational Designs
- When you can’t run an experiment, but you still want to understand the (probable) cause
- Two improvements we could make:
o Track things over time (longitudinal research)
o Control for confounding variables (multiple predictors)
Longitudinal Designs
- Allow you to see if changes in X precede changes in Y
o Establish directionality
o E.g., “smartphone use is associated with depression”
Which comes first?
Longitudinal Designs vs Correlational Designs
- The advantage of longitudinal designs is that they give you a better sense of the directionality between two variables.
o Which one changes first?
o Which comes before the other?
- With a simple cross-sectional design, you can see that two variables are related, but you don’t
know which might have caused which.
o When you measure them both repeatedly, the time course of the variables becomes
clearer.
Alternative Explanations
- Longitudinal designs can’t eliminate the third-variable problem.
- What if some other, third variable is causing both?
o Physical Activity Level
o Family involvement?
o Stress?
- Solution? Maybe using multiple predictors! (ie: multiple IVs)
- (Still no experimental manipulation, only measured variables)
Correlation Matrix
- The numerical value in each box represents the correlation (r)
o Ranges from -1.0 to +1.0
o Value of .2 is modest, .5 is quite large
Significance is indicated with asterisks
- * means p < .05
- ** means p < .01
- *** means p < .001
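A minimal pandas sketch (made-up variables) producing a correlation matrix like the one described above:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
screen_time = rng.normal(3, 1, 50)                        # hypothetical measured variables
depression = 0.5 * screen_time + rng.normal(0, 1, 50)
activity = -0.3 * depression + rng.normal(0, 1, 50)

df = pd.DataFrame({"screen_time": screen_time,
                   "depression": depression,
                   "physical_activity": activity})
print(df.corr())   # each cell is Pearson's r, ranging from -1.0 to +1.0
```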
Linear Regression
- Attempts to “predict” Y using X, by fitting a regression line
o Y = b(X) + error
- 1 Predictor and 1 DV
- Both continuous in nature
Multiple Regression
- With just one predictor, regression is just like correlation
- Advantage of regression: you can have multiple predictors
o Y = b1(X1) + b2(X2) + b3(X3) + error
Each b makes its own independent contribution to the model, over and above
other predictors
- DV must be continuous
IV’s as statistical control
- What is the effect of X1 on Y, over and above the effects of X2 and X3 on Y?
- The variables you control for are called covariates
- They’re expected to co-vary with your main IV!
- In our current example, we might want to consider the effects of screen time on depression,
while controlling for physical activity
o covariates are additional variables that are taken into account in statistical models to
make sure that the effect of the primary independent variable(s) on the dependent
variable is accurately estimated, considering the potential influence of other relevant
factors.
o For example, in a study examining the effect of a new teaching method (independent
variable) on student performance (dependent variable), factors such as prior academic
achievement, socioeconomic status, or student motivation might be considered as
covariates to control for their potential influence on student performance.
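A minimal sketch, assuming statsmodels is available: regress depression on screen time while controlling for physical activity as a covariate (simulated data; variable names are illustrative):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
screen_time = rng.normal(3, 1, 100)
activity = rng.normal(5, 2, 100)
depression = 0.6 * screen_time - 0.4 * activity + rng.normal(0, 1, 100)  # made-up DV

X = sm.add_constant(np.column_stack([screen_time, activity]))  # intercept + 2 predictors
model = sm.OLS(depression, X).fit()
print(model.params)   # each b is that predictor's contribution over and above the others
```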
Survey Research
- Quantitative and Qualitative method with two important characteristics
o Measured using self-reports
o Lots of emphasis put on sampling, often random.
- Most survey research is non-experimental. It is used to describe single variables, without any
manipulation.
Context Effects on Survey Responses
- Complexity can lead to unintended influences on respondents’ answers. These are often
referred to as context effects because they are not related to the content of the item but to the
context in which the item appears.
o For example, there is an item-order effect when the order in which the items are
presented affects people’s responses. One item can change how participants interpret
a later item or change the information that they retrieve to respond to later items
o For example, researcher Fritz Strack and his colleagues asked college students about
both their general life satisfaction and their dating frequency. When the life
satisfaction item came first, the correlation between the two was only −.12,
suggesting that the two variables are only weakly related. But when the dating
frequency item came first, the correlation between the two was +.66, suggesting that
those who date more have a strong tendency to be more satisfied with their lives.
Reporting the dating frequency first made that information more accessible in
memory so that they were more likely to base their life satisfaction rating on it.
o The response options provided can also have unintended effects on people’s
responses. For example, when people are asked how often they are “really irritated”
and given response options ranging from “less than once a year” to “more than once a
month,” they tend to think of major irritations and report being irritated infrequently.
But when they are given response options ranging from “less than once a day” to
“several times a month,” they tend to think of minor irritations and report being
irritated frequently. People also tend to assume that middle response options
represent what is normal or typical. So if they think of themselves as normal or
typical, they tend to choose middle response options. For example, people are likely
to report watching more television when the response options are centered on a
middle option of 4 hours than when centered on a middle option of 2 hours. To
mitigate against order effects, rotate questions and response items when there is no
natural order. Counterbalancing or randomizing the order in which questions are presented in online surveys is good practice and can reduce response-order effects. For example, among undecided voters, the first candidate listed on a ballot receives a roughly 2.5% boost simply by virtue of being listed first.
Types of Survey Items
- Questionnaire items can be either open-ended or closed-ended.
o Open-ended: Fill in a unique response (qualitative)
o Closed-ended: multiple choice (quantitative)
Likert Scale: Present people with a statement. They respond on a scale of
strongly disagree to strongly agree
For closed-ended items, it is also important to create an appropriate response
scale. For categorical variables, the categories presented should generally be
mutually exclusive and exhaustive.
- Effective items: the acronym BRUSO stands for “brief,” “relevant,” “unambiguous,” “specific,” and “objective.”
Formatting a Survey
- Every survey should have a written or spoken introduction that serves two basic functions
o One is to encourage respondents to participate in the survey
o The second function of the introduction is to establish informed consent. Remember
that this involves describing to respondents everything that might affect their decision
to participate.
Sampling
- Once the population has been specified, probability sampling requires a sampling frame.
This sampling frame is essentially a list of all the members of the population from which to
select the respondents.
- There are a variety of different probability sampling methods. Simple random sampling is
done in such a way that each individual in the population has an equal probability of being
selected for the sample.
- A common alternative to simple random sampling is stratified random sampling, in which
the population is divided into different subgroups or “strata” (usually based on demographic
characteristics) and then a random sample is taken from each “stratum.”
- Proportionate stratified random sampling can be used to select a sample in which the
proportion of respondents in each of various subgroups matches the proportion in the
population.
- Disproportionate stratified random sampling can also be used to sample extra respondents
from particularly small subgroups—allowing valid conclusions to be drawn about those
subgroups.
- Yet another type of probability sampling is cluster sampling, in which larger clusters of
individuals are randomly sampled and then individuals within each cluster are randomly
sampled. This is the only probability sampling method that does not require a sampling
frame.
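A minimal numpy sketch (hypothetical frame and strata) contrasting simple random sampling with proportionate stratified random sampling:

```python
import numpy as np

rng = np.random.default_rng(0)
frame = [f"person_{i}" for i in range(1000)]               # hypothetical sampling frame
strata = {"urban": frame[:700], "rural": frame[700:]}       # hypothetical strata (70% / 30%)

simple = rng.choice(frame, size=100, replace=False)          # simple random sample

n = 100
stratified = []
for name, members in strata.items():
    k = round(n * len(members) / len(frame))                 # sample proportion matches population
    stratified.extend(rng.choice(members, size=k, replace=False))
print(len(simple), len(stratified))
```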
Sample Size and Population Size
- Confidence intervals depend only on the size of the sample and not on the size of the
population. So a sample of 1,000 would produce a 95% confidence interval of 47 to 53
regardless of whether the population size was a hundred thousand, a million, or a hundred
million.
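A quick check of the figure above: the 95% margin of error for a proportion depends on n, not on population size:

```python
import math

p, n = 0.50, 1000
margin = 1.96 * math.sqrt(p * (1 - p) / n)    # about 0.031, i.e. roughly +/- 3 points
print(round(100 * (p - margin)), round(100 * (p + margin)))   # -> 47 53
```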
Bias
- Sampling bias occurs when a sample is selected in such a way that it is not representative of
the entire population and therefore produces inaccurate results.
- If these survey non-responders differ from survey responders in systematic ways, then this
difference can produce non-response bias
Third Variables
- Confounds are variables that might explain your effect
o X predicts Y, and that’s because of Z
- Moderators are variables that qualify your effect
o X predicts Y, but only under Z circumstances