Regression Notes (Catatan Regresi)

This document contains the output of several linear regression models of house price (PRICE). Model 1 uses square footage (SQFT) as the only predictor and explains 71.4% of the price variation. Model 2 refits the same regression after two unusual observations are removed, which raises R-Sq to 80.8%, but its residuals fail the independence check. The remaining models are fitted with a dummy variable (COR) that shifts the intercept, the slope, or both. The model with COR in both intercept and slope explains 80.36% of the price variation, the highest R-Sq among the dummy-variable models.

Uploaded by

HAB

Model 1

PRICE = 47.8 + 0.614 SQFT

Predictor  Coef     SE Coef  T      P
Constant   47.82    62.85    0.76   0.448  (not significant)
SQFT       0.61367  0.03625  16.93  0.000  (significant)

S = 204.451   R-Sq = 71.4%   R-Sq(adj) = 71.1%
(So far the model explains only 71.4% of the total variation in the data.)
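
The T column is each coefficient divided by its standard error. A minimal sketch that recomputes it from the Coef and SE Coef values printed above (the helper name `t_statistic` is introduced here for illustration):

```python
# Coefficients and standard errors copied from the Model 1 output above.
model1 = {"Constant": (47.82, 62.85), "SQFT": (0.61367, 0.03625)}

def t_statistic(coef, se_coef):
    """t = estimated coefficient / its standard error."""
    return coef / se_coef

t_values = {name: round(t_statistic(c, se), 2) for name, (c, se) in model1.items()}
print(t_values)  # {'Constant': 0.76, 'SQFT': 16.93}
```

The recomputed values agree with the T column, so the only weakly supported term in Model 1 is the constant.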

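Observations removed (unusual observations in Model 1)
Obs  SQFT  PRICE   Fit     SE Fit  Residual  St Resid
79   3750  1295.0  2349.1  78.3    -1054.1   -5.58 RX
89   2116  2100.0  1346.3  25.3    753.7     3.71 R
R = large standardized residual; X = high leverage.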
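
The fitted values and residuals of the two removed observations follow directly from the Model 1 equation. A short check, assuming the second and third columns of the rows above are SQFT and PRICE (the function name `model1_fit` is illustrative):

```python
def model1_fit(sqft):
    """Fitted PRICE from Model 1: PRICE = 47.82 + 0.61367 SQFT."""
    return 47.82 + 0.61367 * sqft

# (observation number, SQFT, PRICE) for the two removed rows.
removed = [(79, 3750, 1295.0), (89, 2116, 2100.0)]

rows = []
for obs, sqft, price in removed:
    fit = model1_fit(sqft)
    residual = price - fit  # observed minus fitted
    rows.append((obs, round(fit, 1), round(residual, 1)))

print(rows)  # [(79, 2349.1, -1054.1), (89, 1346.3, 753.7)]
```

Observation 79 sits more than 1000 below its fitted price and observation 89 more than 750 above it, which is why both carry large standardized residuals.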

Model 2

PRICE = - 61.8 + 0.682 SQFT

Predictor  Coef     SE Coef  T      P
Constant   -61.81   53.22    -1.16  0.248  (not significant)
SQFT       0.68246  0.03126  21.83  0.000  (significant)

S = 162.902   R-Sq = 80.8%   R-Sq(adj) = 80.7%

(The model now explains 80.8% of the total variation in the data.)

Diagnostic Check
[Figure: Residual Plots for PRICE, four panels: Normal Probability Plot (Percent vs. Residual), Versus Fits (Residual vs. Fitted Value), Histogram (Frequency vs. Residual), Versus Order (Residual vs. Observation Order)]

1. Normality test
- QQ-plot: the residuals appear normally distributed
- Histogram: the residuals appear normally distributed
- Anderson-Darling normality test:
H0: residuals ~ normal
H1: residuals !~ normal
Result:
Mean = 0, StDev = 162.2, N = 115, p-value = 0.077
Fail to reject H0 at the 5% level (0.077 > 0.05), so the residuals are treated as normally distributed
2. Model adequacy (identically distributed)
- Residual vs Fit: the scatter is random, so the linear model is adequate
3. Independence
- Residual vs order
- Durbin-Watson: 1.46321
- ACF: the ACF plot shows the residuals are correlated at lag 1
- The residuals are not independent.
The model does not satisfy the IIDN assumption (independent, identically distributed, normal residuals).
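
The Durbin-Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals. Values near 2 suggest no lag-1 autocorrelation, and values well below 2 suggest positive autocorrelation. The sketch below implements that formula. The residual list is hypothetical, because the Model 2 residuals are not reproduced in these notes. The last line uses the common approximation DW ≈ 2(1 - r1) to back out a rough lag-1 correlation from the reported 1.46321.

```python
def durbin_watson(residuals):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), residuals in observation order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# HYPOTHETICAL residuals that alternate in sign (negative autocorrelation).
demo_residuals = [1.0, -1.0, 1.0, -1.0]
demo_dw = durbin_watson(demo_residuals)
print(demo_dw)  # 3.0, above 2 because neighbours have opposite signs

# Approximation only: DW is roughly 2 * (1 - r1), r1 = lag-1 autocorrelation.
reported_dw = 1.46321
implied_r1 = 1 - reported_dw / 2
print(round(implied_r1, 3))  # 0.268
```

An implied lag-1 correlation of roughly 0.27 is consistent with the lag-1 spike noted in the ACF plot.
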
Regression with Dummy Variables

A. Splitting by Dummy Variable

COR = 1
The regression equation is
PRICE = 486 + 0.303 SQFT

Predictor  Coef     SE Coef  T     P
Constant   485.62   97.40    4.99  0.000
SQFT       0.30316  0.05358  5.66  0.000

S = 163.302   R-Sq = 61.5%   R-Sq(adj) = 59.6%

COR = 0
The regression equation is
PRICE = - 150 + 0.747 SQFT

Predictor  Coef     SE Coef  T      P
Constant   -150.20  62.33    -2.41  0.018
SQFT       0.74673  0.03636  20.54  0.000

S = 172.388   R-Sq = 81.9%   R-Sq(adj) = 81.7%
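
Splitting means fitting one PRICE-on-SQFT regression per COR group. A minimal sketch of that procedure using the closed-form least-squares formulas. The six rows are hypothetical illustration data, not the data set behind the output above, and `fit_line` is a helper name introduced here.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# HYPOTHETICAL rows (SQFT, PRICE, COR), made up for illustration only.
rows = [
    (1000, 600, 0), (2000, 1350, 0), (3000, 2100, 0),
    (1000, 790, 1), (2000, 1090, 1), (3000, 1390, 1),
]

fits = {}
for cor in (0, 1):
    sqft = [r[0] for r in rows if r[2] == cor]
    price = [r[1] for r in rows if r[2] == cor]
    fits[cor] = fit_line(sqft, price)  # one separate fit per group

for cor, (b0, b1) in fits.items():
    print(f"COR = {cor}: PRICE = {b0:.1f} + {b1:.3f} SQFT")
```

Each group gets its own intercept, slope, and residual standard deviation, which is why the two fits above report different values of S.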

B. Dummy in Intercept

The regression equation is
PRICE = 63.0 + 0.617 SQFT - 110 COR

Predictor  Coef     SE Coef  T      P
Constant   63.01    62.04    1.02   0.312  (not significant)
SQFT       0.61701  0.03560  17.33  0.000  (significant)
COR        -110.24  47.52    -2.32  0.022  (significant)

S = 200.663   R-Sq = 72.7%   R-Sq(adj) = 72.2%

COR = 0
PRICE = 63.0 + 0.617 SQFT

COR = 1
PRICE = (63.0 - 110.24) + 0.617 SQFT
PRICE = -47.24 + 0.617 SQFT
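
With the dummy in the intercept only, the two COR groups share one slope, so the fitted lines are parallel and the gap between them equals the COR coefficient at every SQFT. A small sketch using the coefficients from the table above (the function name is illustrative):

```python
def price_intercept_dummy(sqft, cor):
    """Section B fit: COR shifts the intercept only."""
    return 63.01 + 0.61701 * sqft - 110.24 * cor

# Difference between the COR = 1 and COR = 0 lines at several sizes.
gaps = [
    price_intercept_dummy(s, 1) - price_intercept_dummy(s, 0)
    for s in (1000, 2500, 4000)
]
print([round(g, 2) for g in gaps])  # [-110.24, -110.24, -110.24]
```

With the unrounded constant 63.01 the COR = 1 intercept is -47.23. The -47.24 shown above comes from using the rounded constant 63.0.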

[Figure: Scatterplot of FITS1 vs SQFT]

C. Dummy in Slope

PRICE = 20.6627 + 0.65115 SQFT - 0.109117 SQFT*COR

Coefficients

Term      Coef     SE Coef  T        P
Constant  20.6627  58.8371  0.3512   0.726
SQFT      0.6511   0.0348   18.6958  0.000
SQFT*COR  -0.1091  0.0252   -4.3301  0.000

Summary of Model

S = 190.292   R-Sq = 75.41%   R-Sq(adj) = 74.98%
PRESS = 4852605   R-Sq(pred) = 71.10%

COR = 0
PRICE = 20.6627 + 0.65115 SQFT

COR = 1
PRICE = 20.6627 + 0.65115 SQFT - 0.109117 SQFT*1
PRICE = 20.6627 + 0.542033 SQFT
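
With the dummy in the slope only, both groups start from the same intercept and the gap between the lines grows in proportion to SQFT. A brief sketch using the coefficients of the regression equation above (the function name is illustrative):

```python
def price_slope_dummy(sqft, cor):
    """Section C fit: COR changes the SQFT slope only."""
    return 20.6627 + 0.65115 * sqft - 0.109117 * sqft * cor

# Slope that applies when COR = 1.
slope_cor1 = 0.65115 - 0.109117

# Gap between the COR = 1 and COR = 0 lines at several sizes.
gap_at = {
    s: price_slope_dummy(s, 1) - price_slope_dummy(s, 0) for s in (0, 1000, 4000)
}
for sqft, gap in gap_at.items():
    print(sqft, round(gap, 3))
```

The gap is zero at SQFT = 0 and widens by 0.109117 for every additional square foot, which is the opposite restriction to the parallel lines of section B.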

[Figure: Scatterplot of FITS2 vs SQFT]

D. Dummy in Intercept and Slope

Regression Equation

PRICE = -150.198 + 0.74673 SQFT + 635.821 COR - 0.443571 SQFT*COR

Coefficients

Term      Coef      SE Coef  T        P
Constant  -150.198  61.761   -2.4319  0.017
SQFT      0.747     0.036    20.7250  0.000
COR       635.821   119.141  5.3367   0.000
SQFT*COR  -0.444    0.067    -6.6575  0.000

Summary of Model

S = 170.815   R-Sq = 80.36%   R-Sq(adj) = 79.84%
PRESS = 3855146   R-Sq(pred) = 77.04%

Analysis of Variance

Source         DF   Seq SS    Adj SS    Adj MS    F        P
Regression     3    13491870  13491870  4497290   154.135  0.000000
  SQFT         1    11981915  12532593  12532593  429.527  0.000000
  COR          1    216751    830994    830994    28.480   0.000000
  SQFT*COR     1    1293204   1293204   1293204   44.322   0.000000
Error          113  3297077   3297077   29178
  Lack-of-Fit  104  2913623   2913623   28016     0.658    0.849789
  Pure Error   9    383455    383455    42606
Total          116  16788947
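
The summary statistics of this model can be recomputed from the sums of squares in the ANOVA table. A short sketch, using the standard definitions of R-Sq, adjusted R-Sq, predicted R-Sq (from PRESS), and the F ratios (variable names are illustrative):

```python
# Values copied from the ANOVA table and model summary above.
ss_reg, df_reg = 13491870, 3
ss_err, df_err = 3297077, 113
ss_tot, df_tot = 16788947, 116
ss_lof, df_lof = 2913623, 104   # lack-of-fit
ss_pure, df_pure = 383455, 9    # pure error
press = 3855146

r_sq = ss_reg / ss_tot
r_sq_adj = 1 - (ss_err / df_err) / (ss_tot / df_tot)
r_sq_pred = 1 - press / ss_tot
f_regression = (ss_reg / df_reg) / (ss_err / df_err)
f_lack_of_fit = (ss_lof / df_lof) / (ss_pure / df_pure)

print(f"R-Sq       = {100 * r_sq:.2f}%")       # 80.36%
print(f"R-Sq(adj)  = {100 * r_sq_adj:.2f}%")   # 79.84%
print(f"R-Sq(pred) = {100 * r_sq_pred:.2f}%")  # 77.04%
print(f"F (regression)  = {f_regression:.2f}")
print(f"F (lack-of-fit) = {f_lack_of_fit:.3f}")
```

The lack-of-fit p-value of 0.849789 gives no evidence against the straight-line form within each COR group.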

COR = 0
PRICE = -150.198 + 0.74673 SQFT

COR = 1
PRICE = -150.198 + 0.74673 SQFT + 635.821(1) - 0.443571 SQFT(1)
PRICE = 485.623 + 0.303159 SQFT
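
With COR in both the intercept and the slope, the single equation reproduces the two separate lines from section A. The sketch below derives each group's intercept and slope from the four coefficients and also finds the SQFT at which the two fitted lines cross (names such as `line_for` are illustrative):

```python
# Coefficients from the regression equation in section D.
b0, b_sqft, b_cor, b_inter = -150.198, 0.74673, 635.821, -0.443571

def line_for(cor):
    """Intercept and slope implied for one COR group."""
    return b0 + b_cor * cor, b_sqft + b_inter * cor

int0, slope0 = line_for(0)
int1, slope1 = line_for(1)
print(f"COR = 0: PRICE = {int0:.3f} + {slope0:.5f} SQFT")
print(f"COR = 1: PRICE = {int1:.3f} + {slope1:.6f} SQFT")

# The lines cross where the COR terms cancel: b_cor + b_inter * SQFT = 0.
crossover_sqft = -b_cor / b_inter
print(f"Lines cross near SQFT = {crossover_sqft:.1f}")
```

The coefficients match the split fits (-150.20 and 0.74673 for COR = 0, 485.62 and 0.30316 for COR = 1). The standard errors and S differ because this model pools one error variance across both groups.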

[Figure: Scatterplot of FITS3 vs SQFT]
