
PSYB07 Final Notes

This document contains practice theory questions and explanations for a final exam. It covers topics like ANOVA, t-tests, chi-squared tests, correlations, regressions, and more. Multiple questions are asked about assumptions, differences between tests, and interpreting results for each statistical analysis technique.

Uploaded by

layanh3103

Theory Questions

B07 – PRACTICE THEORY QUESTIONS/PROMPTS


FINAL EXAM:

1. Explain why F = 1 in an ANOVA when the null hypothesis is true.


a. The F ratio is the ratio of two mean square values (MS treatment over MS
error). When the null hypothesis is true, both mean squares estimate the
same population error variance, so you expect F to have a value close to
1.0 most of the time. A large F ratio means that the variation among group
means is more than you'd expect to see by chance.
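A quick simulation (a sketch using numpy/scipy with made-up parameters) shows F hovering near 1 when all groups are drawn from the same population, i.e., when the null is true by construction:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Simulate many one-way ANOVAs in which the null hypothesis is true:
# all three groups come from the SAME population
f_values = []
for _ in range(500):
    groups = [rng.normal(loc=50, scale=10, size=10) for _ in range(3)]
    f_stat, _ = stats.f_oneway(*groups)
    f_values.append(f_stat)

# Under the null, the F values average out to roughly 1
mean_f = float(np.mean(f_values))
```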
2. What is the difference between an ANOVA and an independent samples t-test?
Explain.
a. A t-test can only compare 2 groups; with an ANOVA you can compare more
than 2 groups
b. You can only analyze one independent variable with a t-test, but multiple
independent variables with a (factorial) ANOVA
c. Running more and more t-tests inflates the risk of a Type 1 error; a
single ANOVA keeps it at the chosen alpha level
d. An ANOVA can test for interactions between independent variables; t-tests
cannot
3. Why do we need to test for homogeneity of variances when conducting an ANOVA?
a. Homogeneity of variance means the spread (variance) of the outcome is
comparable across the independent groups. The ANOVA pools the
within-group variances into a single error term, so if the groups differ
markedly in variance, that pooled estimate is misleading and the F test
can yield spurious findings.
4. Explain the similarities and/or differences between the chi-squared, t, and F
distributions.
a. Chi-square
i. Compares observed frequencies against expected frequencies
ii. Information: the number of observations in each category and the
expected counts
iii. Distribution: Chi-square distributions are positively skewed, with
the degree of skew decreasing with increasing degrees of
freedom. As the degrees of freedom increase, the chi-square
distribution approaches a normal distribution.
b. t
i. One or two samples
ii. Information: sample mean(s) and sample standard deviation(s)
iii. Distribution: symmetric and bell-shaped like the normal, but with
heavier tails; it approaches the normal distribution as the
degrees of freedom increase
c. F
i. Two or more samples
ii. Information: group sizes, means, and standard deviations
iii. Distribution: positively skewed, not normal; because F is a ratio
of two squared quantities, all its values are positive
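These shape claims can be checked numerically; this sketch (scipy, with df values chosen purely for illustration) shows chi-square skewness shrinking toward 0 and the t critical value approaching the normal's 1.96 as df grows:

```python
from scipy import stats

# Skewness of the chi-square distribution is sqrt(8/df):
# it shrinks toward 0 (the skewness of the normal) as df grows
chi2_skews = [float(stats.chi2.stats(df, moments="s")) for df in (2, 10, 50)]

# Two-tailed 5% critical values of the t distribution shrink toward the
# normal's 1.96 as df grows (heavier tails at small df)
t_crits = [float(stats.t.ppf(0.975, df)) for df in (5, 30, 1000)]
```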
5. Describe the two ways in which you estimate the population variance in an ANOVA.
In what circumstances are these estimates biased versus unbiased?
a. Construct a test statistic that is a ratio of two different and independent
estimates of an assumed common variance among populations. The
numerator (treatment) estimate is based on sample means and variation
among groups; the denominator (error) estimate is based on variation
within samples. Both are unbiased estimates of the population variance
when the null is true. When the null is false, the treatment estimate is
biased upward (inflated by the real differences among population means),
while the error estimate remains unbiased.
6. Why do the critical values for a chi-squared distribution get larger as the degrees
of freedom get larger (as opposed to the t and F distributions, in which larger
degrees of freedom yield smaller critical values)?
a. The d.f. count the number of opportunities the statistic had to become
large: more variables or more categories in a frequency (contingency)
table add more (O - E)^2 / E terms, and every term is non-negative. With
more opportunities (d.f.), a larger statistic is expected even under the
null, so the critical value must be higher for the evidence to rise
above that noise level.
7. Explain the difference between parametric and non-parametric tests. When is it
appropriate versus inappropriate to use each type of test, and why?
a. Parametric tests make assumptions about the parameters of the population
distribution (e.g., that the data are normally distributed).
Nonparametric tests make far fewer distributional assumptions, so the
data can come from a sample that does not follow a specific
distribution. Use a parametric test when its assumptions are met,
because it has more power; use a nonparametric test when the assumptions
are violated (e.g., heavily skewed data, ordinal measures, or very small
samples).
8. Explain how overgeneralization can affect your predicted values in a regression.
a. Overgeneralization (extrapolation) means using the regression equation to
predict DV values for IV values outside the range of the observed data.
The linear relationship may not hold beyond that range, so those
predicted values can be badly inaccurate.
9. Why do we need to test for linearity when conducting a correlation and regression
analysis? Explain and discuss the concept of residuals in your answer.
a. A correlation or regression fits a straight line, so it is only
meaningful if the relationship really is linear. The residuals are the
differences between the observed values and the values predicted by the
line of best fit; if the relationship is linear, the residuals scatter
randomly around zero, whereas a systematic pattern in the residuals
(e.g., a curve) signals that a straight line is the wrong model.
10. Why do you divide by the expected frequencies or probabilities in chi-squared
tests?
a. To normalize bigger and smaller counts: dividing each squared deviation
by its own expected frequency weights deviations relative to the size of
the cell, so the formula does not give a bigger chi-square value just
because you're working with a bigger set of data.
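A worked goodness-of-fit example (hypothetical counts) showing each squared deviation divided by its own expected frequency, checked against scipy:

```python
import numpy as np
from scipy import stats

observed = np.array([25, 15, 10])        # hypothetical counts in three categories
expected = np.array([20.0, 20.0, 10.0])  # counts expected under the null

# Each squared deviation is divided by its OWN expected count, so a deviation
# matters in proportion to the size of the cell it came from
chi2_manual = float(np.sum((observed - expected) ** 2 / expected))

# scipy's goodness-of-fit test computes the same statistic
chi2_scipy, p = stats.chisquare(observed, f_exp=expected)
```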
11. How do outliers affect the results of a t-test versus a chi-squared test versus a
correlation/regression analysis?
a. Correlation
i. Affect the strength of the correlation; thus affecting the value of the
correlation
b. Chi square
i. A chi-squared test works on category counts, so there are no extreme
scores in the usual sense: an unusual case simply adds one
observation to some cell. The test is sensitive instead to cells
with very small expected frequencies, which can distort the
statistic.
c. T test
i. Outliers tend to increase the estimate of sample variance, thus
decreasing the calculated t statistic and lowering the chance
of rejecting the null hypothesis
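A small demonstration (made-up scores) of a single outlier inflating the variance estimate and shrinking |t|:

```python
import numpy as np
from scipy import stats

# Two hypothetical groups with a clear mean difference and small spread
a = np.array([5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.1, 4.7])
b = np.array([5.6, 5.9, 5.7, 6.0, 5.8, 5.5, 5.9, 5.7])

t_clean, _ = stats.ttest_ind(a, b)

# Add one extreme value to group a: the variance estimate balloons,
# shrinking |t| and making the test less likely to reject the null
a_outlier = np.append(a, 20.0)
t_outlier, _ = stats.ttest_ind(a_outlier, b)
```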
12. What are the similarities and/or differences between Pearson’s r and Cohen’s d?
a. Cohen's d measures the size of the difference between two groups
while Pearson's r measures the strength of the relationship between
two variables.
b. d
i. Effect size of a difference
ii. Range: any value (interpreted as an absolute value)
iii. Standardized
iv. Conventions: 0.2 (small), 0.5 (medium), 0.8 (large)
c. r
i. Effect size of a relationship
ii. Range: -1 to +1
iii. Standardized
iv. Conventions: 0.1 (small), 0.3 (medium), 0.5 (large)
13. Are ANOVAs one-tailed or two-tailed tests? Explain.
a. The F test in an ANOVA is one-tailed: both mean squares are sums of
squared deviations, so F can only be positive, and only large values in
the upper tail provide evidence against the null. The research
hypothesis itself is non-directional, though: a significant F says only
that at least one group mean differs, not which one or in which
direction.
Chi-square
● Extension of a binomial test
○ Instead of comparing a single observation against the expected
distribution, you are comparing an observed distribution to the
expected distribution
○ It tests the probability of observing a specific number of frequencies
for two or more categories (relative to the expected frequencies)
■ Tests the extent to which an observed pattern of observations
(frequencies) conforms or fits to an expected pattern. How
well do the observed frequencies fit with the expected
frequencies?

Chi-square assumptions
1. The IV consists of mutually exclusive and exhaustive categories; the
measure is frequency
2. Independence of observations (i.e., one observation per subject; subjects
fit into only one of the mutually-exclusive categories)
3. Expected frequencies for each category are 5 or greater. This is because
the χ2 statistic is calculated from frequencies, which are discrete
counts, while the χ2 distribution itself is continuous; the
approximation is only adequate when the expected counts are reasonably
large

Distribution
1. One-tailed (only the upper tail provides evidence against the null)
2. All values are positive because you are squaring the values
3. Begins to look normal as df approaches infinity
a. df = k-1 where k is the number of categories

Effect size = w = √(χ2/n)

        Small    Medium     Large
w       0.1      0.3        0.5
phi     0.1      0.3-0.5    0.5+


Power: for a χ2 power analysis, you specify the effect size (w), alpha, and
the desired power, then solve for the required N

The χ2 test of independence


● Previously, we concluded that two variables were not independent if there
was any change between P(observed) and P(expected)
● However, we know that the probability of P(observed) being exactly
P(expected) is virtually zero, even if the samples are randomly selected,
because of sampling error
● The χ2 test of independence therefore tests how much of a deviation from
the expected is required before we reject the null and conclude that two
variables are not independent

How do we calculate the expected frequencies?


- To calculate the expected frequency for each cell of the contingency table:
row x column / sample (n)
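A sketch of the row × column / n calculation on a hypothetical 2×2 table, checked against scipy's `chi2_contingency` (which returns the same expected frequencies):

```python
import numpy as np
from scipy import stats

# Hypothetical 2x2 contingency table (rows = condition, columns = outcome)
table = np.array([[30, 10],
                  [20, 40]])

n = table.sum()
row_totals = table.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = table.sum(axis=0, keepdims=True)  # shape (1, 2)

# Expected frequency for each cell: (row total x column total) / n
expected_manual = row_totals * col_totals / n

chi2, p, dof, expected_scipy = stats.chi2_contingency(table, correction=False)
```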

PHI
- This sounds suspiciously like a correlation... That’s because it is one! Phi is
the measure of association between two binary variables. A Pearson
correlation coefficient estimated for two binary variables will return the phi
coefficient.
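A quick check (on simulated binary data) that the Pearson r of two 0/1 variables matches phi = √(χ2/n), up to sign:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Two related binary (0/1) variables: y agrees with x about 80% of the time
x = rng.integers(0, 2, size=200)
flip = rng.random(200) < 0.2
y = np.where(flip, 1 - x, x)

# Pearson r computed directly on the 0/1 data
r = float(np.corrcoef(x, y)[0, 1])

# Phi from the chi-square statistic of the 2x2 table: phi = sqrt(chi2 / n)
table = np.array([[np.sum((x == i) & (y == j)) for j in (0, 1)] for i in (0, 1)])
chi2, _, _, _ = stats.chi2_contingency(table, correction=False)
phi = float(np.sqrt(chi2 / x.size))
```

Phi is always non-negative, so it equals the absolute value of r; the sign of r tells you the direction of the association.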
Correlation and regression
➔ Both are statistical techniques to determine the relationships between two
variables

1. Correlation: determines the strength of an association between two


quantitative variables
2. Simple Regression: predicts one quantitative* dependent variable from the
independent variable
3. Multiple regression: predicts one quantitative* criterion (dependent
variable) from multiple predictors.
4. They are not resistant to outliers

Correlation coefficient
● Pearson’s correlation coefficient of a sample = r
● Correlation coefficient of a population = ρ (rho)

Testing a correlation
● You can use a t-test to assess whether or not a relationship actually exists
between two variables

Assumptions
● Dependent and independent variables are continuous
● The DV and IV are normally distributed
● No outliers in DV or IV; no bivariate outliers
○ Correlations are not resistant to outliers
● DV and IV are linearly related
● Correlation must be significant to run regression

The proportion of variance: r^2 is the proportion of variance in one variable
that is shared with (explained by) the other variable

                      Cohen's d                            Pearson's r

What is it testing?   The effect size of a difference      Effect size of a relationship

Range                 Any value (taken as absolute)        -1 to 1

Formula               d = (M1 - M2) / SDpooled             r = cov(X, Y) / (sX * sY)

Significance Test     T-tests (One Sample,                 Depends on the test
                      Independent, or Pairwise)

Standardized?         Yes                                  Yes

Conventions           S: 0.2  M: 0.5  L: 0.8               S: 0.1  M: 0.3  L: 0.5

Process of calculations
- Test the correlation using a t-test

T-test and correlations:


● Null hypothesis: There is no correlation between the 2 variables (ρ= 0)
● Alternative hypothesis: There is a correlation between the 2 variables (ρ≠ 0)
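The test statistic is t = r·√(n-2) / √(1-r²) with n-2 degrees of freedom; a sketch on simulated data, checked against the p-value from scipy's `pearsonr`:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)  # simulated: y genuinely related to x

r, p_scipy = stats.pearsonr(x, y)
n = x.size

# t statistic for H0: rho = 0, with n - 2 degrees of freedom
t = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
p_manual = 2 * stats.t.sf(abs(t), df=n - 2)
```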

Regression
➔ Associated with observational research
➔ All scores taken into consideration
➔ y = mx + b (slope m, intercept b)
➔ Line of best fit
➔ Residuals
◆ The residuals refer to the difference between your observed y-scores, and the y-scores
generated by your equation
◆ i.e., how far off the line is when predicting values
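A minimal example (made-up points) of fitting the line of best fit and computing the residuals:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # hypothetical observed scores

# Least-squares fit of y = m*x + b
m, b = np.polyfit(x, y, deg=1)
predicted = m * x + b

# Residuals: observed y-scores minus the y-scores the equation generates;
# with an intercept in the model, they sum to (essentially) zero
residuals = y - predicted
```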
ANOVA Test

T-test limitations
● We can only compare two group means at a time
● We can only analyze one IV at a time
● You increase the Type 1 error rate when you do too many tests

Assumptions
1. The dependent variable is continuous and the independent variable is categorical
2. Data and observations are independent
3. DV is normally distributed to all groups
4. Variances are homogeneous for all groups
5. No outliers beyond +/- 4 standard deviations in all groups s

What does the f statistic do?


➔ The F statistic tests whether the variation between groups is greater
than the variation within groups

Variance
- The error estimate of variance is the same whether or not the null is true

Error estimate = mean square error or MS error

- With equal group sizes, MS error = the average of the group variances
(sum the k group variances and divide by k)
- If you have unequal n, you have to take a pooled (weighted) average instead

Treatment estimate = mean square treatment or MS treatment; it is based on
the variation among the group means

If the null is true…


1. The F ratio will be close to 1 (sometimes slightly above, sometimes
slightly below, just by chance)
2. If the null is false, the F ratio is > 1
a. You need to find a critical value using the F table
i. Df treatment = k-1
ii. Df error = N-k (N = total number of observations)

F ratio: MS Treatment/MS Error


- MS error reflects the variance within the groups
- MS treatment reflects the variance among the group means
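Putting the pieces together, a hand-rolled one-way ANOVA (hypothetical scores) whose F = MS treatment / MS error matches scipy's `f_oneway`:

```python
import numpy as np
from scipy import stats

g1 = np.array([4.0, 5.0, 6.0, 5.5])
g2 = np.array([6.5, 7.0, 8.0, 7.5])
g3 = np.array([5.0, 5.5, 6.5, 6.0])
groups = [g1, g2, g3]

k = len(groups)
n_total = sum(g.size for g in groups)
grand_mean = np.concatenate(groups).mean()

# MS treatment: variability of the group means around the grand mean
ss_treat = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ms_treat = ss_treat / (k - 1)          # df treatment = k - 1

# MS error: pooled within-group variability
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_error = ss_error / (n_total - k)    # df error = N - k

f_manual = ms_treat / ms_error
f_scipy, p = stats.f_oneway(*groups)
```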
