EVSC 445 Week 11
[Scatterplot of y vs x with the fitted regression line]
• If β̂0 = β̂intercept = 1.0 and β̂1 = β̂slope = 1.2:
– We used this model to calculate the fitted (expected) immune marker
value ŷ for someone who eats 6 units of junk food.
– ŷ = β̂0 + β̂1 x
– ŷ = 1.0 + 1.2 × 6 = 8.2

March 26, 2024
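The fitted-value calculation above can be sketched as a one-liner (coefficients taken from the slide):

```python
# Fitted value from the estimated simple regression line:
# y-hat = b0 + b1 * x, with b0 = 1.0 and b1 = 1.2 from the slide.
b0, b1 = 1.0, 1.2

def fitted(x):
    """Return the fitted immune-marker value for junk-food intake x."""
    return b0 + b1 * x

print(fitted(6))  # 8.2, matching the worked example
```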
Regression ANOVA Sums of Squares Table
> anova(lm(y~x))
Analysis of Variance Table
ANOVA table

Source of variation   df   Sum of Squares   Mean Square   F-stat
Covariate x            1         1108.80       1108.80    75.686
Error                 19          278.35         14.65         –
Total                 20         1387.15             –         –
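The entries in the ANOVA table are all derived from the sums of squares; a minimal sketch checking the arithmetic:

```python
# Recomputing the ANOVA table entries from the sums of squares:
# MS = SS / df, and F = MS_x / MS_error.
ss_x, df_x = 1108.80, 1
ss_error, df_error = 278.35, 19

ms_x = ss_x / df_x               # 1108.80
ms_error = ss_error / df_error   # ~14.65
f_stat = ms_x / ms_error         # ~75.686

ss_total = ss_x + ss_error       # 1387.15, matching the Total row
```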
> summary(lm(y~x))
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.0000 1.6125 0.62 0.543
x 1.2000 0.1379 8.70 4.72e-08 ***
---
What are the Degrees of Freedom for β̂0 and β̂1 in the t-tests?
• df for β̂0 = number of observations − number of parameters = 21 − 2 = 19
• df for β̂1 = number of observations − number of parameters = 21 − 2 = 19
• Recall how we calculate the t-statistic for a one-sample t-test where
the statistical hypotheses are H0: µ = 0 vs HA: µ ≠ 0.

t-stat = (x̄ − 0) / √(var(x)/n) = (x̄ − 0) / Std.Error(x̄)

The same logic applies to each regression coefficient:

t-stat = (β̂0 − 0) / Std.Error(β̂0) = 1.0 / 1.6125 = 0.62
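Both t values in the summary() table can be reproduced the same way, estimate over standard error:

```python
# Each coefficient's t statistic is its estimate divided by its
# standard error, reproducing the summary(lm(y~x)) output above.
est_b0, se_b0 = 1.0000, 1.6125
est_b1, se_b1 = 1.2000, 0.1379

t_b0 = est_b0 / se_b0  # ~0.62
t_b1 = est_b1 / se_b1  # ~8.70
```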
Understanding the Output of Linear Regression Table
Analysis of Variance Table
Response: y
          Df  Sum Sq  Mean Sq  F value    Pr(>F)
x          1 1108.80  1108.80   75.687 4.719e-08 ***
Residuals 19  278.35    14.65
...
Residual standard error: 3.827 on 19 degrees of freedom
Multiple R-squared: 0.7993, Adjusted R-squared: 0.7888
F-statistic: 75.69 on 1 and 19 DF, p-value: 4.719e-08
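The summary statistics at the bottom of the output follow directly from the ANOVA sums of squares; a sketch of that derivation:

```python
import math

# The bottom lines of summary() follow from the ANOVA sums of squares:
ss_x, ss_error = 1108.80, 278.35
n, p = 21, 2  # 21 observations, 2 estimated parameters

ss_total = ss_x + ss_error
r_squared = ss_x / ss_total                               # ~0.7993
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - p)   # ~0.7888
resid_se = math.sqrt(ss_error / (n - p))                  # ~3.827 on 19 df
```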
yi = β̂0 + β̂1 xi + εi
ŷi = β̂0 + β̂1 xi

yi = β̂0 + β̂1 X1 + εi
where X1 is the only independent variable.
The data table takes the form (where each row is one animal):

Y    X1   X2   ...  Xp
y1   x11  x12  ...  x1p
y2   x21  x22  ...  x2p
...  ...  ...  ...  ...
yn   xn1  xn2  ...  xnp
Interpreting Multiple Regression
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2.3541424 1.0956079 2.149 0.042922 *
number.of.steps 0.0016184 0.0001708 9.473 3.2e-09 ***
age 0.0143239 0.0036154 3.962 0.000662 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This table tells us the estimates for β̂0, β̂1, and β̂2, providing the direction
and the magnitude of each relationship, as well as p-values for our H0's.
How do we use the model to find the predicted value ŷ, the time to migrate?
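Prediction works exactly as in simple regression, just with one term per covariate. A minimal sketch using the estimates above (the step count and age plugged in below are hypothetical inputs, not values from the slides):

```python
# Predicted migration time from the fitted multiple regression:
# y-hat = b0 + b1 * number.of.steps + b2 * age.
b0, b_steps, b_age = 2.3541424, 0.0016184, 0.0143239

def predict(steps, age):
    """Return the predicted time to migrate for given steps and age."""
    return b0 + b_steps * steps + b_age * age

# Hypothetical animal: 5000 steps, age 10.
y_hat = predict(5000, 10)  # ~10.59
```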
• III. Auto-correlation
model.null = lm(Longnose ~ 1,
data=Data)
model.full = lm(Longnose ~ Acerage + DO2 + Maxdepth + NO3 + SO4 + Temp,
data=Data)
step(model.null,
scope = list(upper=model.full),
direction="both",
data=Data)
• Which model? We select the model with the smallest AIC value.
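A sketch of the comparison step() is making. For a linear model, R's extractAIC computes AIC as n·log(RSS/n) + 2k up to an additive constant; the two candidate models below are hypothetical, just to show how the trade-off between fit (RSS) and complexity (k) plays out:

```python
import math

# AIC as compared by step() for lm fits (up to an additive constant):
# AIC = n * log(RSS / n) + 2 * k, where k counts estimated coefficients.
def aic(rss, n, k):
    return n * math.log(rss / n) + 2 * k

# Hypothetical comparison: adding one covariate shrinks RSS from
# 278.35 to 260, but costs one extra parameter. Here the RSS drop is
# too small to justify the extra parameter, so the simpler model
# keeps the smaller AIC and would be retained.
aic_small = aic(278.35, 21, 2)
aic_big = aic(260.0, 21, 3)
```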
• Data transformations
  – Taking logs: it is possible to take the log of the response variable
    and/or any covariates whose distributions do not look normal (add
    some small value if the variable can equal 0).
• Generalized linear models (GLMs): a rich modeling world where the data
  no longer need to be normally distributed (although the normal
  distribution is just a special case).
  – Logistic regression for binary outcomes
  – Poisson regression for count data
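The log-with-offset trick mentioned above can be sketched as follows; the data and the offset of 0.5 are hypothetical choices for illustration (any small constant works, and the choice should be reported):

```python
import math

# Log-transforming a right-skewed variable, adding a small offset so
# that zero values do not produce log(0) = -inf.
counts = [0, 1, 3, 10, 45]  # hypothetical skewed data containing a zero
offset = 0.5                # ad hoc small constant
log_counts = [math.log(c + offset) for c in counts]
```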