Midterm2

Uploaded by

孫利東

Second Midterm Exam 2024/04/29

Note A. For hypothesis-testing questions, you need to: 1. Define your parameters
and state the hypotheses about them. 2. Choose the appropriate test statistic,
write down its formula, and specify its null distribution. 3. Calculate the
critical value for your rejection region, clearly specifying the degrees of
freedom of the quantiles for the t, F, and χ² distributions. 4. State your
statistical conclusion. 5. State the practical conclusion.
Note B. Please conduct hypothesis tests at significance level 0.05 unless
otherwise specified.
Note C. Please use the average value whenever interpolation is needed. For example,
use 0.845 as the 80th percentile of N(0,1).
Note D. For any calculation with decimal numbers, three effective decimal places are
good enough.
Note E. Please make distribution assumptions for your random variables when
needed.
Note F. For quantiles of the t-distribution, if the degrees of freedom exceed 30,
please use the normal quantile as an approximation. You should first specify the
correct t-quantile with the right degrees of freedom and then state the quantile
you will adopt in the normal approximation.
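The conventions in Notes C and F can be sketched in a few lines of Python. This is only an illustration, not part of the exam: the helper `table_quantile` is a hypothetical name, and it assumes a printed z-table with z in steps of 0.01 and probabilities rounded to four decimals.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)

def table_quantile(p, step=0.01):
    """Look up the p-th quantile of N(0, 1) the way a printed z-table would.

    Assumes a table with z on a grid of `step` and probabilities rounded to
    four decimals (an assumption about the table, not stated in the exam).
    If p falls between two tabulated values, return the average of the two
    neighbouring z values, as Note C asks.
    """
    if not 0.5 <= p < 1:
        raise ValueError("this sketch only handles 0.5 <= p < 1")
    prev_z = 0.0
    k = 0
    while True:
        z = k * step
        tabulated = round(STD_NORMAL.cdf(z), 4)
        if tabulated == p:          # p is printed in the table: no averaging
            return round(z, 2)
        if tabulated > p:           # p lies between the last two entries
            return round((prev_z + z) / 2, 3)
        prev_z = z
        k += 1

# Note C: the 80th percentile lies between z = 0.84 and z = 0.85.
print(table_quantile(0.80))   # 0.845

# Note F: for more than 30 degrees of freedom, a two-sided test at level 0.05
# uses the normal quantile z_{0.025} in place of the t quantile.
z_975 = STD_NORMAL.inv_cdf(0.975)
print(round(z_975, 3))        # 1.96
```

The averaged value 0.845 differs slightly from the exact quantile returned by `inv_cdf(0.80)`; the exam asks for the table-based value.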
1. (5 points) Does a high value of R² imply that two variables are causally related?
Why or why not?
2. (5 points) A residual plot for a simple linear regression is as follows. Which
problems can be detected from the plot?
[Figure: residual plot]
(a) Misspecification of the mean model
(b) Heteroscedasticity
(c) Deviation from the normal distribution
3. To investigate the relationship between a car's mileage and its sale price for
2007 model year Camrys, the following data show the mileage and sale price for
19 sales. The scatter plot below suggests a linear relationship between miles and
price.

A simple linear regression model was fitted to the above data with Miles as the
explanatory variable and Price as the response, producing the following
output.
R² = 0.5387
ANOVA table:
Df Sum Sq Mean Sq F value Pr(>F)
Miles 47.158 (c) 0.000348
Residuals (a) (b)

Coefficient table:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 16.46976 2.99E-12
Miles -0.05877 (d) 0.000348
(a)–(d) (20 points) Please fill in the cells in the above tables.
(e) (5 points) What is the correlation coefficient between Miles and Price?
(f) (5 points) What are the hypotheses (H0, H1) tested with the F-statistic and the
t-statistics in the above tables? Please specify the model and the parameters in
your hypotheses.
(g) (5 points) What do you conclude about the linear relationship
between Miles and Price from the above analysis?
(h) (5 points) In the data, the mean mileage is 66,737 miles and the
mean price is $12,547. Suppose we estimate the mean price for cars with
40,000 miles and the mean price for cars with 50,000 miles. Which estimate will
have the narrower confidence interval? Give your reasons.
4. The following table contains the job satisfaction scores for individuals in four
job types. The data were collected according to a completely randomized design.

To test whether the mean satisfaction scores are the same across the four job types,
we obtain the following one-way ANOVA table.
Df SS MS F Pr(>F)
Treatment (a) 4.8661 0.006081
Residuals 4782.6 (c)
Total (b)

(a)–(c) (15 points) Please fill in the cells in the ANOVA table.
(d) (5 points) What are the hypotheses (H0, H1) tested with the above F-statistic?
Please specify the model and the parameters in your hypotheses.
(e) (5 points) Given the significant result above, we need to test which pairs of
job types differ significantly in satisfaction score. How should the
significance level for each comparison be adjusted using the Bonferroni correction?
5. A factorial experiment was designed to test for significant differences in the
time needed to translate English into a foreign language with two
computerized language translators. Because the language translated was
also considered a significant factor, translations were made with both systems for
three different languages: Spanish, French, and German.
Time (hours)        Language
            Spanish   French   German
System 1       8        10       12
              12        14       16
              10        12       14
System 2       6        14       16
              10        16       22
               8        15       19
(a) (7 points) Please draw an interaction plot for the system factor and the
language factor. Is there an interaction effect? Please explain your answer.
(b)–(c) (15 points) The following is the two-way ANOVA table for the analysis.
Please fill in the cells in the table.
Df Sum Sq Mean Sq F value Pr(>F)
language 85.5 0.000161
system 18 0.064206
interaction 4.5 0.034815
Residuals (b) 4.333
Total (c)
(d) (3 points) Please state your conclusions about each main effect and the
interaction effect.
6. (5 points) What are the assumptions made for the one-way analysis of variance?

Appendix
Margin of error:

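$$t_{n-2,\,\alpha/2}\,\sqrt{\frac{s_\varepsilon^2}{(n-1)\,s_x^2}}$$

$$t_{n-2,\,\alpha/2}\; s_\varepsilon \sqrt{\frac{1}{n} + \frac{(x_g-\bar{x})^2}{(n-1)\,s_x^2}},$$

$$t_{n-2,\,\alpha/2}\; s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_g-\bar{x})^2}{(n-1)\,s_x^2}}.$$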
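
As a rough illustration of how the three margins of error behave, the sketch below evaluates them in Python. It reads the formulas, in order, as the usual margins for the slope, for the mean response at x_g, and for a new observation at x_g; the appendix does not label them, so that reading is an assumption. The function names and all numbers are hypothetical and are not taken from any exam question.

```python
import math

def slope_margin(t_q, s_eps, n, s_x):
    """First formula: t * sqrt(s_eps^2 / ((n - 1) * s_x^2))."""
    return t_q * math.sqrt(s_eps**2 / ((n - 1) * s_x**2))

def mean_response_margin(t_q, s_eps, n, s_x, x_g, x_bar):
    """Second formula: t * s_eps * sqrt(1/n + (x_g - x_bar)^2 / ((n - 1) * s_x^2))."""
    return t_q * s_eps * math.sqrt(1 / n + (x_g - x_bar) ** 2 / ((n - 1) * s_x**2))

def prediction_margin(t_q, s_eps, n, s_x, x_g, x_bar):
    """Third formula: same as the second, with an extra 1 under the square root."""
    return t_q * s_eps * math.sqrt(1 + 1 / n + (x_g - x_bar) ** 2 / ((n - 1) * s_x**2))

# Hypothetical inputs, chosen only to exercise the formulas.
t_q, s_eps, n, s_x, x_bar = 2.0, 1.5, 20, 10.0, 50.0

near = mean_response_margin(t_q, s_eps, n, s_x, x_g=55.0, x_bar=x_bar)
far = mean_response_margin(t_q, s_eps, n, s_x, x_g=70.0, x_bar=x_bar)
pred = prediction_margin(t_q, s_eps, n, s_x, x_g=55.0, x_bar=x_bar)

print(near < far)    # True: the margin grows as x_g moves away from x_bar
print(pred > near)   # True: the extra 1 under the root widens the interval
```

The second and third formulas depend on x_g only through (x_g − x̄)², so for fixed data the margin is smallest at x_g = x̄.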
