Assignment-1 (2)
Call:
lm(formula = log(price) ~ log(area) + rooms + baths + age, data = hprice)
Residuals:
Min 1Q Median 3Q Max
-1.3856 -0.1901 0.0122 0.1992 0.8413
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.7588037 0.4649094 14.538 < 2e-16 ***
log(area) 0.5288392 0.0694604 7.614 3.11e-13 ***
rooms 0.0593313 0.0231439 2.564 0.010822 *
baths 0.1190959 0.0348483 3.418 0.000715 ***
age -0.0037630 0.0005464 -6.887 3.09e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
9) Test the null hypothesis H0: none of the explanatory variables has an effect on the
dependent variable, at the 1% level.
INSTRUCTIONS
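1) The sample regression function:
$\widehat{\log(\text{price})} = 6.7588 + 0.5288\,\log(\text{area}) + 0.0593\,\text{rooms} + 0.1191\,\text{baths} - 0.0038\,\text{age}$
Equivalently, in terms of the observed values and the residual $\hat{u}$:
$\log(\text{price}) = 6.7588 + 0.5288\,\log(\text{area}) + 0.0593\,\text{rooms} + 0.1191\,\text{baths} - 0.0038\,\text{age} + \hat{u}$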
2) $\hat{\beta}_1 = 0.5288$: Holding all other factors fixed, if the square footage increases by 1%,
the house price is predicted to increase by about 0.5288%.
3) Test H0: $\beta_{\text{baths}} = 0$ against H1: $\beta_{\text{baths}} \neq 0$
Method 1: $t_{\text{baths}} = 3.418$
Given a significance level $\alpha = 3\%$, we have $\alpha/2 = 0.015$.
Since $n - (k+1) = 316$ is large, $t_{\alpha/2}(n-(k+1)) \approx z_{\alpha/2} = z_{0.015} = 2.17$.
Because $|t| = 3.418 > z_{\alpha/2} = 2.17$, we reject H0.
So, the number of bathrooms influences the house price.
Method 2: p-value(baths) = 0.000715 < $\alpha$ = 3%, so we reject H0.
So, the number of bathrooms influences the house price.
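Method 3: $\beta_{\text{baths}} \in \left( \hat{\beta}_{\text{baths}} - t_{\alpha/2}(n-(k+1)) \cdot se(\hat{\beta}_{\text{baths}});\ \hat{\beta}_{\text{baths}} + t_{\alpha/2}(n-(k+1)) \cdot se(\hat{\beta}_{\text{baths}}) \right)$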
Given a confidence level $(1-\alpha) = 1 - 0.03 = 0.97$, we have $\alpha/2 = 0.015$.
Since $n - (k+1) = 316$ is large, $t_{\alpha/2}(n-(k+1)) \approx z_{\alpha/2} = z_{0.015} = 2.17$.
Therefore:
$\beta_{\text{baths}} \in (0.1191 - 2.17 \times 0.0348;\ 0.1191 + 2.17 \times 0.0348)$
So: $\beta_{\text{baths}} \in (0.0436;\ 0.1946)$
Since $0 \notin (0.0436;\ 0.1946)$, we reject H0.
So, the number of bathrooms influences the house price.
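The three methods above can be checked numerically. The following is a minimal Python sketch, not part of the original R workflow. It assumes the large-sample normal approximation used above and takes the rounded estimate and standard error from the regression output; the variable names are illustrative.

```python
from statistics import NormalDist

# Rounded values copied from the regression output for baths.
beta_hat = 0.1191   # estimated coefficient
se = 0.0348         # standard error
alpha = 0.03        # significance level

# Large-sample critical value z_{alpha/2}, used in place of t_{alpha/2}(316).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Method 1: compare |t| with the critical value.
# (With rounded inputs, t differs slightly from the 3.418 reported by R.)
t_stat = beta_hat / se
reject_by_t = abs(t_stat) > z_crit

# Method 3: confidence interval at level 1 - alpha; reject H0 if 0 lies outside it.
lower = beta_hat - z_crit * se
upper = beta_hat + z_crit * se
reject_by_ci = not (lower <= 0 <= upper)

print(f"z_crit = {z_crit:.4f}, t = {t_stat:.3f}, reject H0: {reject_by_t}")
print(f"CI = ({lower:.4f}; {upper:.4f}), reject H0: {reject_by_ci}")
```

Both decisions agree by construction: 0 falls outside the interval exactly when |t| exceeds the critical value.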
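4) The confidence interval is:
$\beta_{\text{rooms}} \in \left( \hat{\beta}_{\text{rooms}} - t_{\alpha/2}(n-(k+1)) \cdot se(\hat{\beta}_{\text{rooms}});\ \hat{\beta}_{\text{rooms}} + t_{\alpha/2}(n-(k+1)) \cdot se(\hat{\beta}_{\text{rooms}}) \right)$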
Given a confidence level $(1-\alpha) = 98\%$, we have $\alpha/2 = 0.01$.
Since $n - (k+1) = 316$ is large, $t_{\alpha/2}(n-(k+1)) = t_{0.01}(316) \approx z_{\alpha/2} = z_{0.01} = 2.326$.
Therefore: $\beta_{\text{rooms}} \in (0.0593 - 2.326 \times 0.0231;\ 0.0593 + 2.326 \times 0.0231)$
So: $\beta_{\text{rooms}} \in (0.0056;\ 0.1130)$
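The interval above can be reproduced with a small helper. This is a hedged Python sketch under the same large-sample normal approximation; `normal_ci` is a hypothetical helper name introduced here for illustration, and the inputs are the rounded values from the regression output.

```python
from statistics import NormalDist

def normal_ci(estimate, std_error, confidence):
    """Large-sample interval: estimate +/- z_{alpha/2} * std_error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * std_error
    return estimate - half_width, estimate + half_width

# 98% interval for the coefficient on rooms.
rooms_ci = normal_ci(0.0593, 0.0231, 0.98)
print(tuple(round(bound, 4) for bound in rooms_ci))
```

Because the interval excludes 0, it is consistent with rooms being significant at the 2% level in a two-sided test.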
5) Test H0: $\beta_{\text{age}} = 0$ against H1: $\beta_{\text{age}} < 0$
Method 1: $t_{\text{age}} = -6.887$
Given a significance level $\alpha = 2\%$: since $n - (k+1) = 316$ is large, $t_{\alpha}(n-(k+1)) \approx z_{\alpha} = z_{0.02} = 2.054$.
Because $t = -6.887 < -z_{\alpha} = -2.054$, we reject H0.
So, the older a house is, the lower its price, holding the other factors fixed.
Method 2: for this one-sided test, p-value(age) = $3.09 \times 10^{-11} / 2$ < $\alpha$ = 2%, so we reject H0.
So, the older a house is, the lower its price, holding the other factors fixed.
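The one-sided test differs from the two-sided case in two places: the critical value uses the full α in one tail, and the two-sided p-value reported by R is halved. A minimal Python sketch of both checks, assuming the normal approximation and using the values printed in the regression output:

```python
from statistics import NormalDist

t_age = -6.887           # t value for age from the regression output
p_two_sided = 3.09e-11   # Pr(>|t|) reported by R (two-sided)
alpha = 0.02

# One-sided critical value z_alpha: all of alpha sits in the lower tail.
# The exact normal quantile is about 2.054; rounded tables may differ slightly.
z_alpha = NormalDist().inv_cdf(1 - alpha)
reject_by_critical_value = t_age < -z_alpha

# Halving is valid here because the sign of t agrees with H1 (beta_age < 0).
p_one_sided = p_two_sided / 2
reject_by_p_value = p_one_sided < alpha

print(round(z_alpha, 3), reject_by_critical_value, reject_by_p_value)
```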
6) Test H0: $\beta_{\text{rooms}} = 0.08$ against H1: $\beta_{\text{rooms}} \neq 0.08$
Method 1: $t = (0.0593 - 0.08) / 0.0231 = -0.896$
Given a significance level $\alpha = 1\%$, we have $\alpha/2 = 0.005$.
Since $n - (k+1) = 316$ is large, $t_{\alpha/2}(n-(k+1)) = t_{0.005}(316) \approx z_{\alpha/2} = z_{0.005} = 2.576$.
Since $|t| = 0.896 < 2.576$, we cannot reject H0.
Hence the data are consistent with the following statement: when we compare two houses with
the same square footage, the same number of bathrooms and the same age, where house A has one
more room than house B, house A's price is predicted to be about 8% higher than house B's price.
(We do not reject this statement.)
Method 2: The confidence interval of $\beta_{\text{rooms}}$ is: ….
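When the hypothesized value is not zero, the t value printed by R cannot be used directly; the statistic has to be recomputed around the hypothesized value. A short Python sketch of Method 1 only, assuming the normal approximation and the rounded estimate and standard error from the output:

```python
from statistics import NormalDist

beta_hat = 0.0593   # estimated coefficient on rooms
se = 0.0231         # its standard error
beta_null = 0.08    # value of beta_rooms under H0
alpha = 0.01

# t statistic centred on the hypothesized value rather than on 0.
t_stat = (beta_hat - beta_null) / se

# Two-sided large-sample critical value z_{alpha/2}.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(t_stat) > z_crit

print(f"t = {t_stat:.3f}, z_crit = {z_crit:.3f}, reject H0: {reject}")
```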
7) Test H0: $\beta_{\log(\text{area})} = 0$ against H1: $\beta_{\log(\text{area})} \neq 0$
Test H0: $\beta_{\text{rooms}} = 0$ against H1: $\beta_{\text{rooms}} \neq 0$
8) Testing exclusion restrictions: H0: $\beta_{\log(\text{area})} = \beta_{\text{rooms}} = 0$.
H1: $\beta_{\log(\text{area})} \neq 0$ or $\beta_{\text{rooms}} \neq 0$.
We have: F = 37.433.
Given a significance level $\alpha = 1\%$, $F_{\alpha}(q;\ n-(k+1)) = F_{0.01}(2;\ 316) \approx 4.61$.
Since $F = 37.433 > F_{\alpha} = 4.61$, we reject H0.
So, log(area) and rooms are jointly significant.
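The critical value 4.61 can be recovered without a table in this particular case. As the denominator degrees of freedom grow, q·F behaves like a chi-square variable with q degrees of freedom, and for q = 2 the chi-square upper-tail probability is exp(-x/2). The Python sketch below uses that large-sample shortcut; it is an approximation for 316 denominator degrees of freedom, not the exact F quantile.

```python
import math

alpha = 0.01
q = 2              # number of restrictions under H0
f_stat = 37.433    # F statistic reported in the text

# Large-sample shortcut, valid only for q = 2:
# P(F > c) ~ P(chi2(2) > 2c) = exp(-c), so the critical value is -ln(alpha).
f_crit_large_sample = -math.log(alpha)

reject = f_stat > f_crit_large_sample
print(f"F_crit ~ {f_crit_large_sample:.3f}, reject H0: {reject}")
```

The exact critical value with 316 denominator degrees of freedom is slightly larger, which does not affect the decision here because the statistic is far above it.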
9) Test H0: $R^2 = 0$ against H1: $R^2 > 0$ (test of overall significance of the regression)
Method 1:
Given a significance level $\alpha = 1\%$, $F_{\alpha}(k;\ n-(k+1)) = F_{0.01}(4;\ 316) \approx 3.32$.
Since $F = 110.63 > F_{\alpha} = 3.32$, we reject H0.
Thus, we can conclude that the explanatory variables in the regression jointly explain
variation in log(price); at least one of them has a nonzero effect.
Method 2:
p-value(F) < $2.2 \times 10^{-16}$ < $\alpha$ = 1%, so we reject H0.
Thus, we can conclude that the explanatory variables in the regression jointly explain
variation in log(price); at least one of them has a nonzero effect.
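The link between this F statistic and $R^2$ is $F = \frac{R^2 / k}{(1 - R^2) / (n-(k+1))}$, which can be inverted. The Python sketch below shows the relationship in both directions. The $R^2$ it prints is only the value implied by the reported F = 110.63 with k = 4 and 316 residual degrees of freedom; it is not read from the regression output, and the function names are illustrative.

```python
def f_from_r2(r2, k, df_resid):
    """Overall-significance F statistic from R-squared."""
    return (r2 / k) / ((1 - r2) / df_resid)

def r2_from_f(f_stat, k, df_resid):
    """Invert the formula above to get the R-squared implied by F."""
    return k * f_stat / (k * f_stat + df_resid)

k = 4            # number of explanatory variables
df_resid = 316   # n - (k + 1)
f_stat = 110.63  # overall F statistic reported in the text

r2_implied = r2_from_f(f_stat, k, df_resid)
print(f"implied R^2 = {r2_implied:.4f}")

# Round trip: plugging the implied R^2 back in recovers F.
f_check = f_from_r2(r2_implied, k, df_resid)
```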
PRACTICE IN R
Call:
lm(formula = log(price) ~ log(area) + rooms + baths + age, data = hprice)
Residuals:
Min 1Q Median 3Q Max
-1.3856 -0.1901 0.0122 0.1992 0.8413
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.7588037 0.4649094 14.538 < 2e-16 ***
log(area) 0.5288392 0.0694604 7.614 3.11e-13 ***
rooms 0.0593313 0.0231439 2.564 0.010822 *
baths 0.1190959 0.0348483 3.418 0.000715 ***
age -0.0037630 0.0005464 -6.887 3.09e-11 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Title:
Jarque - Bera Normalality Test
Test Results:
STATISTIC:
X-squared: 33.3374
P VALUE:
Asymptotic p Value: 5.766e-08
Null hypothesis H0: The error term $u_i$ follows a normal distribution.
We have: p-value = $5.766 \times 10^{-8}$ < $\alpha$ = 0.05, so we reject H0.
Thus, the error term $u_i$ is not normally distributed.
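The asymptotic p-value in the output can be checked by hand. Under H0 the Jarque-Bera statistic is asymptotically chi-square with 2 degrees of freedom, whose upper-tail probability is exp(-x/2). A minimal Python sketch, assuming that asymptotic distribution:

```python
import math

jb_stat = 33.3374   # X-squared from the Jarque-Bera output
alpha = 0.05

# Upper-tail probability of a chi-square(2) variable: P(X > x) = exp(-x / 2).
p_value = math.exp(-jb_stat / 2)
reject_normality = p_value < alpha

print(f"p-value = {p_value:.3e}, reject H0: {reject_normality}")
```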
data: phandu1
W = 0.9838, p-value = 0.001114
Instruction: look at p-value to conclude.
data: phandu1
A = 0.58913, p-value = 0.1236
Instruction: look at p-value to conclude.
data: phandu1
D = 0.041139, p-value = 0.2066
Instruction: look at p-value to conclude.
Hypothesis:
log(area) = 0
rooms = 0