Linear Regression Assignment

The document discusses performing linear and multiple linear regression analyses to understand the relationship between an All Sky Surface UV Index and various meteorological variables. Temperature, earth skin temperature, specific humidity, and wind speed will be used as independent variables to predict the UV index value. Linear regression will first be done with one independent variable, then multiple linear regression will be performed using two or more independent variables.

Uploaded by

rizka

Assignment 3: Regression Analysis

Delivered by
Name : Rizka Amelia Dwi Safira
NRP : 03311940000044
Subject : Applied Statistics and Probabilities
Find your own data that show a relationship between one dependent variable and more than
one independent variable.
1. Perform linear regression analysis between the dependent variable and one
independent variable. Show whether the regression equation is good enough using the
goodness of fit and residual analysis.
2. Perform multiple linear regression analysis between the dependent variable and two or
more independent variables. Show whether the regression equation is good enough using the
goodness of fit and residual analysis.
3. Analyze whether it is true that more independent variables give a better model.

DATA
The data used for the regression analysis are taken from the POWER Data Access Viewer
(DAV). The viewer is served via ArcGIS Online and gives the community access to
meteorological and solar-related parameters for assessing and designing renewable
energy systems. POWER (Prediction of Worldwide Energy Resources) is a NASA project
that provides solar and meteorological data to three user communities:
1) Renewable Energy (RE); 2) Sustainable Buildings (SB); and 3) Agroclimatology (AG).
The whole dataset can be accessed through: https://power.larc.nasa.gov/data-access-viewer/.

Here, the data from the Renewable Energy (RE) community are used. The details of the
selected data are as follows:
• Temporal average of data : Daily
• Location of study : 40.4054°N and 3.6981°W (Madrid, Spain)
• Time extent : 01-April-2021 to 30-June-2021
Among the available parameters, the analysis focuses on the All Sky Surface UV Index,
which is modeled as depending on four variables:
Table 1 Parameters for Dependent and Independent Variable

Type of Variable       Parameter(s)                            Code                   Variable
Dependent Variable     All Sky Surface UV Index                ALLSKY_SFC_UV_INDEX    y
Independent Variable   Temperature at 2 meters (°C)            T2M                    x1
Independent Variable   Earth Skin Temperature (°C)             TS                     x2
Independent Variable   Specific Humidity at 2 meters (g/kg)    QV2M                   x3
Independent Variable   Wind Speed at 2 meters (m/s)            WS2M                   x4

Table 2 Data for Linear Regression Analysis

MO  DY  y (ALLSKY_SFC_UV_INDEX)  x1 (T2M)  x2 (TS)  x3 (QV2M)  x4 (WS2M)
4 1 1.2 12.9 12.97 6.23 2.3
4 2 0.99 11.13 10.86 7.57 1.16
4 3 1.04 11.19 11.23 6.96 2.34
4 4 1.38 9.9 10.26 4.82 2.34
4 5 1.34 11.19 11.27 4.64 1.54
4 6 1.44 11.8 11.98 5.13 2.27
4 7 1.59 9.88 10.59 4.46 2.79
4 8 1.05 10.44 10.94 6.23 2.27
4 9 0.66 9.3 9.58 6.47 1.81
4 10 1.03 10.97 11.65 6.9 1.62
4 11 1.03 10.45 11.12 6.65 2.39
4 12 1.48 8.22 8.84 3.91 1.88
4 13 0.69 10.12 10.69 6.29 1.37
4 14 1.06 12.73 13.46 7.51 1.51
4 15 0.95 11.53 12.48 6.04 1.55
4 16 1.3 9.02 10.02 4.27 2.73
4 17 1.32 7.01 7.95 3.91 3.01
4 18 1.41 9.12 9.8 4.39 1.8
4 19 1.28 11.77 10.85 5.55 1.17
4 20 1.16 12.01 12.15 6.96 1.33
4 21 1.17 10.65 11.15 6.77 1.42
4 22 1.23 11.59 12.26 6.9 0.88
4 23 1.02 11.24 11.82 6.65 2.93
4 24 1.27 11.8 12.62 7.02 3.8
4 25 0.45 10.19 10.58 7.32 2.41
4 26 1.27 12.26 12.66 7.2 1.38
4 27 1.29 13.39 14.1 7.75 0.86
4 28 1.12 12.52 12.81 7.87 2.19
4 29 1.27 11.25 11.73 7.02 2.69
4 30 1.33 9.8 10.1 5.92 1.41
5 1 1.33 9.8 10.45 5.55 1.86
5 2 1.42 10.34 11.72 5.62 1.45
5 3 1.17 10.99 11.66 6.59 1.73
5 4 1.8 13.21 13.73 6.29 1.04
5 5 1.99 14.81 14.87 6.35 1.52
5 6 1.97 16.09 15.83 7.26 2.02
5 7 1.92 17.1 17.15 7.87 1.59
5 8 1.88 18.12 17.2 7.69 2.3
5 9 0.95 13.42 13.56 8 4.05
5 10 1.62 10.05 11.08 5.92 2.82
5 11 1.52 10.22 11.55 5.25 2.84
5 12 1.75 11.97 12.7 6.29 3.57
5 13 1.36 12.83 13.72 6.77 3.14
5 14 1.98 14.36 15.39 6.53 1.95
5 15 1.86 17.2 17.48 8.42 2.82
5 16 1.85 17.55 18.31 9.09 3.03
5 17 2.49 15.69 16.94 6.35 1.78
5 18 2.37 17.34 17.94 7.51 2.62
5 19 2.32 16.08 17.51 6.35 2.58
5 20 2.3 18.46 19.73 7.81 1.48
5 21 2.26 19.87 20.32 8.06 2.54
5 22 1.76 13.5 14.41 5.62 2.76
5 23 1.33 13.63 15.19 6.16 1.61
5 24 1.51 14.16 15.55 6.59 2.3
5 25 1.94 16.16 18.19 6.65 1.83
5 26 1.84 17.67 19.38 7.81 1.48
5 27 1.57 18.88 19.75 9.03 1.62
5 28 1.71 18.9 20.03 9.83 1.3
5 29 2.05 21.18 22.83 9.46 1.16
5 30 1.92 22.52 23.54 10.01 1.45
5 31 1.92 22.7 23.73 10.68 1.99
6 1 1.86 19.12 20.16 9.64 2.55
6 2 2.1 19.09 19.96 7.87 1.6
6 3 2.12 20.1 21.26 8.18 2.34
6 4 2.32 19.87 21.75 7.69 2.12
6 5 1.04 16.04 16.22 9.4 1.91
6 6 2.46 20.63 22.05 8.3 0.81
6 7 2.5 22.94 24.48 8.3 1.48
6 8 2.55 23.99 25.02 8.24 1.62
6 9 2.41 23.87 25.31 8.24 2.21
6 10 2.49 24.16 25.51 8.54 1.7
6 11 2.23 23.93 25.86 8.97 2.45
6 12 2.34 23.15 24.8 9.46 2.75
6 13 2.27 24.62 25.9 8.91 2.78
6 14 2.13 25.3 27.51 9.4 1.7
6 15 1.92 24.98 26.41 9.52 3.28
6 16 1.89 23.85 25.08 10.31 2.31
6 17 1.19 17.28 17.73 10.01 2.02
6 18 1.56 18.69 19.2 9.52 2.18
6 19 1.87 18.51 19.8 6.77 2.31
6 20 1.42 16.73 17.85 8.18 4.01
6 21 1.93 16.47 18.02 7.39 2.9
6 22 1.34 17.03 18.94 7.81 1.91
6 23 1.8 17.18 19.32 6.53 2.98
6 24 2.16 19.06 21.56 6.96 1.88
6 25 2.13 23 24.62 8.18 1.63
6 26 2.29 24.57 25.53 8.42 2.66
6 27 2.37 22.05 23.25 6.65 3.28
6 28 2.38 18.81 20.92 5.68 2.59
6 29 2.32 21.3 23.19 6.65 1.32
6 30 2.56 22.81 24.29 6.84 2.3

1. PERFORM LINEAR REGRESSION ANALYSIS BETWEEN DEPENDENT
VARIABLE AND ONE INDEPENDENT VARIABLE
This section performs the linear regression analysis between the All Sky Surface UV Index
(y) as the dependent variable and Temperature at 2 meters (x1) as the independent variable.
Table 3 Computation Table for Performing Linear Regression (1 Dependent vs 1 Independent Variable)

No  x (T2M)  y (ALLSKY_SFC_UV_INDEX)  x²  xy  ŷ = 0.3932 + 0.081x  SSE (y-ŷ)²  SST (y-ȳ)²  SSR (SST-SSE)  ɛ (y-ŷ)
1 12.9 1.2 166.410 15.480 1.439 0.057 0.226 0.169 -0.239
2 11.13 0.99 123.877 11.019 1.295 0.093 0.469 0.376 -0.305
3 11.19 1.04 125.216 11.638 1.300 0.068 0.403 0.336 -0.260
4 9.9 1.38 98.010 13.662 1.196 0.034 0.087 0.053 0.184
5 11.19 1.34 125.216 14.995 1.300 0.002 0.112 0.111 0.040
6 11.8 1.44 139.240 16.992 1.350 0.008 0.055 0.047 0.090
7 9.88 1.59 97.614 15.709 1.194 0.157 0.007 -0.150 0.396
8 10.44 1.05 108.994 10.962 1.239 0.036 0.391 0.355 -0.189
9 9.3 0.66 86.490 6.138 1.147 0.237 1.030 0.793 -0.487
10 10.97 1.03 120.341 11.299 1.282 0.064 0.416 0.352 -0.252
11 10.45 1.03 109.203 10.764 1.240 0.044 0.416 0.372 -0.210
12 8.22 1.48 67.568 12.166 1.059 0.177 0.038 -0.139 0.421
13 10.12 0.69 102.414 6.983 1.213 0.274 0.970 0.696 -0.523
14 12.73 1.06 162.053 13.494 1.425 0.133 0.378 0.245 -0.365
15 11.53 0.95 132.941 10.954 1.328 0.143 0.526 0.383 -0.378
16 9.02 1.3 81.360 11.726 1.124 0.031 0.141 0.110 0.176
17 7.01 1.32 49.140 9.253 0.961 0.129 0.126 -0.003 0.359
18 9.12 1.41 83.174 12.859 1.132 0.077 0.070 -0.007 0.278
19 11.77 1.28 138.533 15.066 1.347 0.005 0.156 0.152 -0.067
20 12.01 1.16 144.240 13.932 1.367 0.043 0.265 0.223 -0.207
21 10.65 1.17 113.423 12.461 1.256 0.007 0.255 0.248 -0.086
22 11.59 1.23 134.328 14.256 1.333 0.011 0.198 0.188 -0.103
23 11.24 1.02 126.338 11.465 1.304 0.081 0.429 0.348 -0.284
24 11.8 1.27 139.240 14.986 1.350 0.006 0.164 0.158 -0.080
25 10.19 0.45 103.836 4.586 1.219 0.591 1.501 0.909 -0.769
26 12.26 1.27 150.308 15.570 1.387 0.014 0.164 0.150 -0.117
27 13.39 1.29 179.292 17.273 1.478 0.035 0.148 0.113 -0.188
28 12.52 1.12 156.750 14.022 1.408 0.083 0.308 0.225 -0.288
29 11.25 1.27 126.563 14.288 1.305 0.001 0.164 0.163 -0.035
30 9.8 1.33 96.040 13.034 1.187 0.020 0.119 0.099 0.143
31 9.8 1.33 96.040 13.034 1.187 0.020 0.119 0.099 0.143
32 10.34 1.42 106.916 14.683 1.231 0.036 0.065 0.029 0.189
33 10.99 1.17 120.780 12.858 1.284 0.013 0.255 0.242 -0.114
34 13.21 1.8 174.504 23.778 1.464 0.113 0.016 -0.097 0.336
35 14.81 1.99 219.336 29.472 1.593 0.157 0.099 -0.058 0.397
36 16.09 1.97 258.888 31.697 1.697 0.074 0.087 0.013 0.273
37 17.1 1.92 292.410 32.832 1.779 0.020 0.060 0.040 0.141
38 18.12 1.88 328.334 34.066 1.862 0.000 0.042 0.042 0.018
39 13.42 0.95 180.096 12.749 1.481 0.282 0.526 0.244 -0.531
40 10.05 1.62 101.003 16.281 1.208 0.170 0.003 -0.167 0.412
41 10.22 1.52 104.448 15.534 1.221 0.089 0.024 -0.065 0.299
42 11.97 1.75 143.281 20.948 1.363 0.150 0.006 -0.144 0.387
43 12.83 1.36 164.609 17.449 1.433 0.005 0.099 0.094 -0.073
44 14.36 1.98 206.210 28.433 1.557 0.179 0.093 -0.086 0.423
45 17.2 1.86 295.840 31.992 1.787 0.005 0.034 0.029 0.073
46 17.55 1.85 308.003 32.468 1.816 0.001 0.031 0.029 0.034
47 15.69 2.49 246.176 39.068 1.665 0.681 0.664 -0.017 0.825
48 17.34 2.37 300.676 41.096 1.799 0.327 0.483 0.156 0.571
49 16.08 2.32 258.566 37.306 1.696 0.389 0.416 0.027 0.624
50 18.46 2.3 340.772 42.458 1.889 0.169 0.391 0.222 0.411
51 19.87 2.26 394.817 44.906 2.004 0.066 0.342 0.276 0.256
52 13.5 1.76 182.250 23.760 1.487 0.074 0.007 -0.067 0.273
53 13.63 1.33 185.777 18.128 1.498 0.028 0.119 0.091 -0.168
54 14.16 1.51 200.506 21.382 1.541 0.001 0.027 0.026 -0.031
55 16.16 1.94 261.146 31.350 1.703 0.056 0.070 0.014 0.237
56 17.67 1.84 312.229 32.513 1.825 0.000 0.027 0.027 0.015
57 18.88 1.57 356.454 29.642 1.923 0.125 0.011 -0.114 -0.353
58 18.9 1.71 357.210 32.319 1.925 0.046 0.001 -0.045 -0.215
59 21.18 2.05 448.592 43.419 2.110 0.004 0.141 0.137 -0.060
60 22.52 1.92 507.150 43.238 2.218 0.089 0.060 -0.029 -0.298
61 22.7 1.92 515.290 43.584 2.233 0.098 0.060 -0.038 -0.313
62 19.12 1.86 365.574 35.563 1.943 0.007 0.034 0.027 -0.083
63 19.09 2.1 364.428 40.089 1.940 0.025 0.181 0.155 0.160
64 20.1 2.12 404.010 42.612 2.022 0.010 0.198 0.188 0.098
65 19.87 2.32 394.817 46.098 2.004 0.100 0.416 0.316 0.316
66 16.04 1.04 257.282 16.682 1.693 0.427 0.403 -0.023 -0.653
67 20.63 2.46 425.597 50.750 2.065 0.156 0.616 0.460 0.395
68 22.94 2.5 526.244 57.350 2.252 0.061 0.681 0.619 0.248
69 23.99 2.55 575.520 61.175 2.337 0.045 0.766 0.720 0.213
70 23.87 2.41 569.777 57.527 2.328 0.007 0.540 0.533 0.082
71 24.16 2.49 583.706 60.158 2.351 0.019 0.664 0.645 0.139
72 23.93 2.23 572.645 53.364 2.333 0.011 0.308 0.297 -0.103
73 23.15 2.34 535.923 54.171 2.269 0.005 0.442 0.437 0.071
74 24.62 2.27 606.144 55.887 2.389 0.014 0.354 0.340 -0.119
75 25.3 2.13 640.090 53.889 2.444 0.098 0.207 0.109 -0.314
76 24.98 1.92 624.000 47.962 2.418 0.248 0.060 -0.188 -0.498
77 23.85 1.89 568.823 45.077 2.326 0.190 0.046 -0.144 -0.436
78 17.28 1.19 298.598 20.563 1.794 0.364 0.235 -0.129 -0.604
79 18.69 1.56 349.316 29.156 1.908 0.121 0.013 -0.108 -0.348
80 18.51 1.87 342.620 34.614 1.893 0.001 0.038 0.037 -0.023
81 16.73 1.42 279.893 23.757 1.749 0.108 0.065 -0.043 -0.329
82 16.47 1.93 271.261 31.787 1.728 0.041 0.065 0.024 0.202
83 17.03 1.34 290.021 22.820 1.773 0.188 0.112 -0.076 -0.433
84 17.18 1.8 295.152 30.924 1.786 0.000 0.016 0.015 0.014
85 19.06 2.16 363.284 41.170 1.938 0.049 0.235 0.186 0.222
86 23 2.13 529.000 48.990 2.257 0.016 0.207 0.191 -0.127
87 24.57 2.29 603.685 56.265 2.384 0.009 0.378 0.369 -0.094
88 22.05 2.37 486.203 52.259 2.180 0.036 0.483 0.447 0.190
89 18.81 2.38 353.816 44.768 1.918 0.214 0.497 0.283 0.462
90 21.3 2.32 453.690 49.416 2.119 0.040 0.416 0.376 0.201
91 22.81 2.56 520.296 58.394 2.242 0.101 0.783 0.682 0.318
SUM 1439.300 152.430 25007.874 2592.704 8.839 23.572 14.733
AVG 15.816 1.675
Computation:
• Covariance:
\[ s_{xy} = \frac{1}{n-1}\left(\sum xy - \frac{\sum x \sum y}{n}\right) = \frac{1}{90}\left(2592.704 - \frac{1439.300 \times 152.430}{91}\right) = 2.01997 \]

• x-variance:
\[ s_x^2 = \frac{1}{n-1}\left(\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right) = \frac{1}{90}\left(25007.874 - \frac{(1439.300)^2}{91}\right) = 24.92455 \]

• Determining regression equation:
\[ b_1 = \frac{s_{xy}}{s_x^2} = \frac{2.01997}{24.92455} = 0.08104 \]
\[ b_0 = y_{avg} - b_1 x_{avg} = 1.675 - (0.08104 \times 15.816) = 0.39323 \]
So, the regression equation is: ŷ = 0.39323 + 0.08104x
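The computation above can be sketched in a few lines of R. This is only an illustrative check, assuming the first ten rows of Table 2 rather than the full 91 days, so the printed coefficients will differ from 0.39323 and 0.08104. The variable names `s_xy`, `s_xx`, and `fit` are chosen here for illustration.

```r
# First ten rows of Table 2 only (illustrative subset, not the full 91 days)
x <- c(12.9, 11.13, 11.19, 9.9, 11.19, 11.8, 9.88, 10.44, 9.3, 10.97)  # T2M
y <- c(1.2, 0.99, 1.04, 1.38, 1.34, 1.44, 1.59, 1.05, 0.66, 1.03)      # UV index
n <- length(x)

# Covariance and x-variance from the column sums, as in the formulas above
s_xy <- (sum(x * y) - sum(x) * sum(y) / n) / (n - 1)
s_xx <- (sum(x^2) - sum(x)^2 / n) / (n - 1)

# Slope and intercept of the least-squares line
b1 <- s_xy / s_xx
b0 <- mean(y) - b1 * mean(x)

# Cross-check against R's built-in fit
fit <- lm(y ~ x)
print(c(b0 = b0, b1 = b1))
print(coef(fit))
```

Both printed pairs should agree, which confirms that the hand formulas and `lm()` describe the same line.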
BY USING GOODNESS OF FIT
In Goodness of Fit, we evaluate the regression relationship by the ratio SSR/SST, which is
called the coefficient of determination (r²):
\[ r^2 = \frac{SSR}{SST} = \frac{14.733}{23.572} = 0.62504 \]
Hence, the coefficient of determination in this analysis is 0.62504 or 62.504%. This indicates
that the regression model accounts for 62.504% of the variability of the data.
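As a rough sketch of how SSE, SST, SSR, and r² relate, the following R snippet builds them from the residuals and compares the ratio with the value reported by `summary()`. It assumes the first ten rows of Table 2 only, so the number it prints is not the 0.62504 obtained from the full dataset.

```r
# First ten rows of Table 2 only (illustrative subset, not the full 91 days)
x <- c(12.9, 11.13, 11.19, 9.9, 11.19, 11.8, 9.88, 10.44, 9.3, 10.97)  # T2M
y <- c(1.2, 0.99, 1.04, 1.38, 1.34, 1.44, 1.59, 1.05, 0.66, 1.03)      # UV index

fit   <- lm(y ~ x)
y_hat <- fitted(fit)

sse <- sum((y - y_hat)^2)    # unexplained variation
sst <- sum((y - mean(y))^2)  # total variation
ssr <- sst - sse             # variation explained by the regression

r2 <- ssr / sst
print(r2)
print(summary(fit)$r.squared)  # should match r2
```

For a simple linear regression with an intercept, the same value also equals the squared correlation `cor(x, y)^2`.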

Figure 1 Regression Line Obtained Using Excel (Left) and R Program (Right)

Here’s the script for creating Linear Regression Analysis using R:


# Temperature at 2 meters (x) and UV index (y); middle values are elided here
x <- c(12.9,11.13,11.19,9.9,11.19,11.8,9.88,10.44,9.3,10.97,...,21.3,22.81)
y <-
c(1.2,0.99,1.04,1.38,1.34,1.44,1.59,1.05,0.66,1.03,1.03,1.48,...,2.32,2.56)

# Fit the simple linear regression and print its summary.
relation <- lm(y ~ x)
print(summary(relation))

# Give the chart file a name.
png(file = "linearregression_uvindex_vs_temp2m.png")

# Plot the chart, then overlay the fitted regression line.
plot(x, y, col = "blue", cex = 1.3, pch = 16,
     main = "All Sky Surface UV Index and Temperature at 2 meters Regression",
     xlab = "Temperature at 2 meters in Celsius",
     ylab = "UV Index (dimensionless)")
abline(relation)

# Save the file.
dev.off()

Result:
Call:
lm(formula = y ~ x)

Residuals:
Min 1Q Median 3Q Max
-0.76907 -0.22682 0.01444 0.22959 0.82520
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.393233 0.110301 3.565 0.000588 ***
x 0.081043 0.006654 12.180 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3151 on 89 degrees of freedom
Multiple R-squared: 0.625, Adjusted R-squared: 0.6208
F-statistic: 148.4 on 1 and 89 DF, p-value: < 2.2e-16

BY USING RESIDUAL ANALYSIS


In residual analysis, the regression model is verified by plotting the residuals (ɛ) against
the predicted values of the dependent variable (ŷ). A suitable model shows residuals scattered
randomly around zero with no clear pattern. The residuals of each measurement have been
computed in Table 3.

Figure 2 Residual Analysis Plot Obtained Using Excel (Left) and R Program (Right)

Here’s the script for creating Residual Analysis using R:


# Get list of residuals
res <- resid(relation)

# Give the chart file a name.
png(file = "res_uvindex_vs_temp2m.png")

# Produce residual vs. fitted plot
plot(fitted(relation), res, col = "blue", main = "Residual Analysis Plot",
     xlab = "Fitted Model", ylab = "Residual")

# Add a horizontal line at 0
abline(h = 0)

# Save the file.
dev.off()
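Besides inspecting the plot by eye, a few numeric checks can support the residual analysis. The sketch below is an assumption-laden illustration, not part of the original assignment: it uses the first ten rows of Table 2 only, and the split of residuals into a lower and upper half of fitted values is just one informal way to compare spread.

```r
# First ten rows of Table 2 only (illustrative subset, not the full 91 days)
x <- c(12.9, 11.13, 11.19, 9.9, 11.19, 11.8, 9.88, 10.44, 9.3, 10.97)  # T2M
y <- c(1.2, 0.99, 1.04, 1.38, 1.34, 1.44, 1.59, 1.05, 0.66, 1.03)      # UV index

fit <- lm(y ~ x)
res <- resid(fit)
fv  <- fitted(fit)

# With an intercept in the model, least-squares residuals sum to zero
# and are uncorrelated with the fitted values.
print(sum(res))
print(cor(fv, res))

# Informal spread comparison: residual standard deviation in the lower
# and upper halves of the fitted values (similar values suggest constant spread).
lower <- res[fv <= median(fv)]
upper <- res[fv >  median(fv)]
print(c(sd_lower = sd(lower), sd_upper = sd(upper)))
```

The first two quantities are zero by construction, so they confirm the fit rather than its quality; the spread comparison and the plot are what indicate whether the residuals look random.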
2. PERFORM LINEAR REGRESSION ANALYSIS BETWEEN DEPENDENT
VARIABLE AND FOUR INDEPENDENT VARIABLES
This section performs the linear regression analysis between the All Sky Surface UV Index
(y) as the dependent variable and Temperature at 2 meters (x1), Earth Skin Temperature (x2),
Specific Humidity at 2 meters (x3), and Wind Speed at 2 meters (x4) as four independent
variables.
Table 4 Computation Table for Performing Linear Regression (1 Dependent vs 4 Independent Variables)

NO  y (ALLSKY_SFC_UV_INDEX)  x1 (T2M)  x2 (TS)  x3 (QV2M)  x4 (WS2M)  ŷ  SSE (y-ŷ)²  SST (y-ȳ)²  SSR (SST-SSE)  ɛ (y-ŷ)
1 1.2 12.9 12.97 6.23 2.3 1.503 0.092 0.226 0.134 -0.303
2 0.99 11.13 10.86 7.57 1.16 1.008 0.000 0.469 0.469 -0.018
3 1.04 11.19 11.23 6.96 2.34 1.133 0.009 0.403 0.395 -0.093
4 1.38 9.9 10.26 4.82 2.34 1.417 0.001 0.087 0.086 -0.037
5 1.34 11.19 11.27 4.64 1.54 1.625 0.081 0.112 0.031 -0.285
6 1.44 11.8 11.98 5.13 2.27 1.594 0.024 0.055 0.032 -0.154
7 1.59 9.88 10.59 4.46 2.79 1.488 0.010 0.007 -0.003 0.102
8 1.05 10.44 10.94 6.23 2.27 1.195 0.021 0.391 0.370 -0.145
9 0.66 9.3 9.58 6.47 1.81 1.002 0.117 1.030 0.913 -0.342
10 1.03 10.97 11.65 6.9 1.62 1.132 0.010 0.416 0.406 -0.102
11 1.03 10.45 11.12 6.65 2.39 1.109 0.006 0.416 0.410 -0.079
12 1.48 8.22 8.84 3.91 1.88 1.400 0.006 0.038 0.032 0.080
13 0.69 10.12 10.69 6.29 1.37 1.152 0.214 0.970 0.757 -0.462
14 1.06 12.73 13.46 7.51 1.51 1.231 0.029 0.378 0.349 -0.171
15 0.95 11.53 12.48 6.04 1.55 1.386 0.190 0.526 0.335 -0.436
16 1.3 9.02 10.02 4.27 2.73 1.422 0.015 0.141 0.126 -0.122
17 1.32 7.01 7.95 3.91 3.01 1.236 0.007 0.126 0.119 0.084
18 1.41 9.12 9.8 4.39 1.8 1.417 0.000 0.070 0.070 -0.007
19 1.28 11.77 10.85 5.55 1.17 1.503 0.050 0.156 0.106 -0.223
20 1.16 12.01 12.15 6.96 1.33 1.250 0.008 0.265 0.257 -0.090
21 1.17 10.65 11.15 6.77 1.42 1.119 0.003 0.255 0.252 0.051
22 1.23 11.59 12.26 6.9 0.88 1.219 0.000 0.198 0.198 0.011
23 1.02 11.24 11.82 6.65 2.93 1.203 0.033 0.429 0.396 -0.183
24 1.27 11.8 12.62 7.02 3.8 1.190 0.006 0.164 0.158 0.080
25 0.45 10.19 10.58 7.32 2.41 0.933 0.233 1.501 1.267 -0.483
26 1.27 12.26 12.66 7.2 1.38 1.234 0.001 0.164 0.163 0.036
27 1.29 13.39 14.1 7.75 0.86 1.273 0.000 0.148 0.148 0.017
28 1.12 12.52 12.81 7.87 2.19 1.117 0.000 0.308 0.308 0.003
29 1.27 11.25 11.73 7.02 2.69 1.128 0.020 0.164 0.144 0.142
30 1.33 9.8 10.1 5.92 1.41 1.185 0.021 0.119 0.098 0.145
31 1.33 9.8 10.45 5.55 1.86 1.261 0.005 0.119 0.114 0.069
32 1.42 10.34 11.72 5.62 1.45 1.328 0.009 0.065 0.057 0.092
33 1.17 10.99 11.66 6.59 1.73 1.198 0.001 0.255 0.254 -0.028
34 1.8 13.21 13.73 6.29 1.04 1.550 0.063 0.016 -0.047 0.250
35 1.99 14.81 14.87 6.35 1.52 1.731 0.067 0.099 0.032 0.259
36 1.97 16.09 15.83 7.26 2.02 1.695 0.075 0.087 0.012 0.275
37 1.92 17.1 17.15 7.87 1.59 1.706 0.046 0.060 0.014 0.214
38 1.88 18.12 17.2 7.69 2.3 1.855 0.001 0.042 0.041 0.025
39 0.95 13.42 13.56 8 4.05 1.182 0.054 0.526 0.472 -0.232
40 1.62 10.05 11.08 5.92 2.82 1.209 0.169 0.003 -0.166 0.411
41 1.52 10.22 11.55 5.25 2.84 1.373 0.022 0.024 0.002 0.147
42 1.75 11.97 12.7 6.29 3.57 1.365 0.148 0.006 -0.143 0.385
43 1.36 12.83 13.72 6.77 3.14 1.381 0.000 0.099 0.099 -0.021
44 1.98 14.36 15.39 6.53 1.95 1.641 0.115 0.093 -0.022 0.339
45 1.86 17.2 17.48 8.42 2.82 1.592 0.072 0.034 -0.037 0.268
46 1.85 17.55 18.31 9.09 3.03 1.500 0.122 0.031 -0.092 0.350
47 2.49 15.69 16.94 6.35 1.78 1.853 0.406 0.664 0.258 0.637
48 2.37 17.34 17.94 7.51 2.62 1.805 0.319 0.483 0.164 0.565
49 2.32 16.08 17.51 6.35 2.58 1.895 0.180 0.416 0.236 0.425
50 2.3 18.46 19.73 7.81 1.48 1.906 0.155 0.391 0.235 0.394
51 2.26 19.87 20.32 8.06 2.54 2.013 0.061 0.342 0.281 0.247
52 1.76 13.5 14.41 5.62 2.76 1.711 0.002 0.007 0.005 0.049
53 1.33 13.63 15.19 6.16 1.61 1.635 0.093 0.119 0.026 -0.305
54 1.51 14.16 15.55 6.59 2.3 1.603 0.009 0.027 0.019 -0.093
55 1.94 16.16 18.19 6.65 1.83 1.858 0.007 0.070 0.063 0.082
56 1.84 17.67 19.38 7.81 1.48 1.810 0.001 0.027 0.026 0.030
57 1.57 18.88 19.75 9.03 1.62 1.699 0.017 0.011 -0.006 -0.129
58 1.71 18.9 20.03 9.83 1.3 1.542 0.028 0.001 -0.027 0.168
59 2.05 21.18 22.83 9.46 1.16 1.917 0.018 0.141 0.123 0.133
60 1.92 22.52 23.54 10.01 1.45 1.963 0.002 0.060 0.058 -0.043
61 1.92 22.7 23.73 10.68 1.99 1.841 0.006 0.060 0.054 0.079
62 1.86 19.12 20.16 9.64 2.55 1.594 0.071 0.034 -0.036 0.266
63 2.1 19.09 19.96 7.87 1.6 1.968 0.017 0.181 0.163 0.132
64 2.12 20.1 21.26 8.18 2.34 2.027 0.009 0.198 0.189 0.093
65 2.32 19.87 21.75 7.69 2.12 2.110 0.044 0.416 0.372 0.210
66 1.04 16.04 16.22 9.4 1.91 1.250 0.044 0.403 0.359 -0.210
67 2.46 20.63 22.05 8.3 0.81 2.090 0.137 0.616 0.479 0.370
68 2.5 22.94 24.48 8.3 1.48 2.378 0.015 0.681 0.666 0.122
69 2.55 23.99 25.02 8.24 1.62 2.517 0.001 0.766 0.764 0.033
70 2.41 23.87 25.31 8.24 2.21 2.500 0.008 0.540 0.532 -0.090
71 2.49 24.16 25.51 8.54 1.7 2.479 0.000 0.664 0.664 0.011
72 2.23 23.93 25.86 8.97 2.45 2.358 0.016 0.308 0.292 -0.128
73 2.34 23.15 24.8 9.46 2.75 2.150 0.036 0.442 0.406 0.190
74 2.27 24.62 25.9 8.91 2.78 2.448 0.032 0.354 0.322 -0.178
75 2.13 25.3 27.51 9.4 1.7 2.455 0.105 0.207 0.102 -0.325
76 1.92 24.98 26.41 9.52 3.28 2.363 0.196 0.060 -0.136 -0.443
77 1.89 23.85 25.08 10.31 2.31 2.063 0.030 0.046 0.016 -0.173
78 1.19 17.28 17.73 10.01 2.02 1.282 0.009 0.235 0.227 -0.092
79 1.56 18.69 19.2 9.52 2.18 1.563 0.000 0.013 0.013 -0.003
80 1.87 18.51 19.8 6.77 2.31 2.119 0.062 0.038 -0.024 -0.249
81 1.42 16.73 17.85 8.18 4.01 1.578 0.025 0.065 0.040 -0.158
82 1.93 16.47 18.02 7.39 2.9 1.726 0.042 0.065 0.023 0.204
83 1.34 17.03 18.94 7.81 1.91 1.725 0.148 0.112 -0.036 -0.385
84 1.8 17.18 19.32 6.53 2.98 2.001 0.040 0.016 -0.025 -0.201
85 2.16 19.06 21.56 6.96 1.88 2.168 0.000 0.235 0.235 -0.008
86 2.13 23 24.62 8.18 1.63 2.410 0.078 0.207 0.129 -0.280
87 2.29 24.57 25.53 8.42 2.66 2.541 0.063 0.378 0.315 -0.251
88 2.37 22.05 23.25 6.65 3.28 2.584 0.046 0.483 0.437 -0.214
89 2.38 18.81 20.92 5.68 2.59 2.390 0.000 0.497 0.497 -0.010
90 2.32 21.3 23.19 6.65 1.32 2.518 0.039 0.416 0.377 -0.198
91 2.56 22.81 24.29 6.84 2.3 2.655 0.009 0.783 0.774 -0.095
SUM 152.430 1439.300 1521.460 659.280 192.960 152.430 4.805 23.572 18.767
AVG 1.675
Computation:
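By using the matrix method, the normal equations may be written as the product of a known
5 x 5 matrix (N) and an unknown 5 x 1 matrix (b), giving a known 5 x 1 matrix (c):
\[ N b = c \]
\[
\begin{bmatrix}
n & \sum x_1 & \sum x_2 & \sum x_3 & \sum x_4 \\
\sum x_1 & \sum x_1^2 & \sum x_1 x_2 & \sum x_1 x_3 & \sum x_1 x_4 \\
\sum x_2 & \sum x_1 x_2 & \sum x_2^2 & \sum x_2 x_3 & \sum x_2 x_4 \\
\sum x_3 & \sum x_1 x_3 & \sum x_2 x_3 & \sum x_3^2 & \sum x_3 x_4 \\
\sum x_4 & \sum x_1 x_4 & \sum x_2 x_4 & \sum x_3 x_4 & \sum x_4^2
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}
=
\begin{bmatrix} \sum y \\ \sum x_1 y \\ \sum x_2 y \\ \sum x_3 y \\ \sum x_4 y \end{bmatrix}
\]
Hence:
\[ b = N^{-1} c \]
Computing this in Excel gives:
\[
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \end{bmatrix}
=
\begin{bmatrix} 1.18056 \\ 0.11684 \\ 0.01069 \\ -0.20817 \\ -0.01140 \end{bmatrix}
\]
So, the multiple linear regression equation is:
\[ \hat{y} = 1.18056 + 0.11684 x_1 + 0.01069 x_2 - 0.20817 x_3 - 0.01140 x_4 \]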
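The matrix method above can also be sketched in R instead of Excel. The snippet below is a minimal illustration, assuming the first twelve rows of Table 2 only, so its coefficients will not equal those of the full 91-day fit. The names `X`, `N`, and `c_vec` are chosen here to mirror the notation N b = c.

```r
# First twelve rows of Table 2 only (illustrative subset, not the full 91 days)
y  <- c(1.2, 0.99, 1.04, 1.38, 1.34, 1.44, 1.59, 1.05, 0.66, 1.03, 1.03, 1.48)
x1 <- c(12.9, 11.13, 11.19, 9.9, 11.19, 11.8, 9.88, 10.44, 9.3, 10.97, 10.45, 8.22)
x2 <- c(12.97, 10.86, 11.23, 10.26, 11.27, 11.98, 10.59, 10.94, 9.58, 11.65, 11.12, 8.84)
x3 <- c(6.23, 7.57, 6.96, 4.82, 4.64, 5.13, 4.46, 6.23, 6.47, 6.9, 6.65, 3.91)
x4 <- c(2.3, 1.16, 2.34, 2.34, 1.54, 2.27, 2.79, 2.27, 1.81, 1.62, 2.39, 1.88)

# Design matrix with a leading column of ones for the intercept b0
X <- cbind(1, x1, x2, x3, x4)

# N holds the sums and cross-products (N[1, 1] is n); c_vec holds the sums with y
N     <- t(X) %*% X
c_vec <- t(X) %*% y

# Solve N b = c directly rather than forming the inverse explicitly
b <- solve(N, c_vec)

# Cross-check against R's built-in fit
fit <- lm(y ~ x1 + x2 + x3 + x4)
print(cbind(normal_eq = as.vector(b), lm = coef(fit)))
```

Using `solve(N, c_vec)` instead of `solve(N) %*% c_vec` gives the same b while avoiding an explicit matrix inverse, which tends to be numerically safer when predictors such as T2M and TS are strongly correlated.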
BY USING GOODNESS OF FIT
In Goodness of Fit, we evaluate the regression relationship by the ratio SSR/SST, which is
called the coefficient of determination (R²):
\[ R^2 = \frac{SSR}{SST} = \frac{18.767}{23.572} = 0.79614 \]
Hence, the coefficient of determination in this analysis is 0.79614 or 79.614%. This indicates
that the regression model accounts for 79.614% of the variability of the data.

Instead of computing manually, both Excel and R have built-in features to determine the
values of b0, b1, …, bn. The results from both programs are shown below, including the
y-intercept and the slope of each parameter, together with the standard error, t-statistic,
and P-value.

Result of Data Analysis Using Excel


Regression Statistics
Multiple R           0.892266
R Square             0.796138
Adjusted R Square    0.786656
Standard Error       0.236384
Observations         91

              df    SS          MS          F           Significance F
Regression    4     18.76663    4.691658    83.96368    7.05E-29
Residual      86    4.805442    0.055877
Total         90    23.57207

                Coefficients    Standard Error    t Stat      P-value
Intercept       1.18056         0.148183          7.966892    6.17E-12
X Variable 1    0.116844        0.052511          2.225123    0.02869
X Variable 2    0.010693        0.046837          0.22831     0.819947
X Variable 3    -0.20817        0.026287          -7.91923    7.7E-12
X Variable 4    -0.0114         0.035445          -0.32151    0.748603
The R Square, intercept, and coefficients of X Variables 1-4 match the manual
computation.
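As a worked check of how the entries in the Excel output relate to each other, the R Square, F, and Standard Error values can be reproduced from the SS and df columns. The notation SS and MS with subscripts is used here only for illustration:

```latex
\[ R^2 = \frac{SS_{Regression}}{SS_{Total}} = \frac{18.76663}{23.57207} \approx 0.79614 \]
\[ F = \frac{MS_{Regression}}{MS_{Residual}} = \frac{18.76663 / 4}{4.805442 / 86} = \frac{4.691658}{0.055877} \approx 83.96 \]
\[ \text{Standard Error} = \sqrt{MS_{Residual}} = \sqrt{0.055877} \approx 0.2364 \]
```

These agree with the reported R Square (0.796138), F (83.96368), and Standard Error (0.236384) up to rounding.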

Result of Data Analysis Using R Program


The script:
x1 <- c(12.9,11.13,11.19,9.9,11.19,11.8,9.88,10.44,9.3,10.97,… ,21.3,22.81)
x2 <- c(12.97,10.86,11.23,10.26,11.27,11.98,10.59,10.94,9.58,… ,23.19,24.29)
x3 <- c(6.23,7.57,6.96,4.82,4.64,5.13,4.46,6.23,6.47,6.9,6.65 ,… ,6.65,6.84)
x4 <- c(2.3,1.16,2.34,2.34,1.54,2.27,2.79,2.27,1.81,1.62,2.39,… ,1.32,2.3)
y <- c(1.2,0.99,1.04,1.38,1.34,1.44,1.59,1.05,0.66,1.03,1.03,… ,2.32,2.56)

# Create the relationship model
model <- lm(y ~ x1 + x2 + x3 + x4)

# Show the model
print(model)
print(summary(model))

# Get the intercept and coefficients as vector elements.
cat("# # # # The Coefficient Values # # # ","\n")

b0 <- coef(model)[1]
print(b0)

b1 <- coef(model)[2]
b2 <- coef(model)[3]
b3 <- coef(model)[4]
b4 <- coef(model)[5]

print(b1)
print(b2)
print(b3)
print(b4)

# Diagnostic plots, including residuals vs. fitted values
plot(model)

The Result:
Call:
lm(formula = y ~ x1 + x2 + x3 + x4)

Coefficients:
(Intercept) x1 x2 x3 x4
1.18056 0.11684 0.01069 -0.20817 -0.01140

Call:
lm(formula = y ~ x1 + x2 + x3 + x4)

Residuals:
Min 1Q Median 3Q Max
-0.48304 -0.16450 -0.00303 0.13733 0.63720

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.18056 0.14818 7.967 6.17e-12 ***
x1 0.11684 0.05251 2.225 0.0287 *
x2 0.01069 0.04684 0.228 0.8199
x3 -0.20817 0.02629 -7.919 7.70e-12 ***
x4 -0.01140 0.03545 -0.322 0.7486
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.2364 on 86 degrees of freedom
Multiple R-squared: 0.7961, Adjusted R-squared: 0.7867
F-statistic: 83.96 on 4 and 86 DF, p-value: < 2.2e-16

>
> # Get the Intercept and coefficients as vector elements.
> cat("# # # # The Coefficient Values # # # ","\n")
# # # # The Coefficient Values # # #
>
> b0 <- coef(model)[1]
> print(b0)
(Intercept)
1.18056
>
> print(b1)
x1
0.116844
> print(b2)
x2
0.01069345
> print(b3)
x3
-0.2081744
> print(b4)
x4
-0.01139604

The R-squared, intercept, and coefficients of x1-x4 match the manual
computation.

BY USING RESIDUAL ANALYSIS


In residual analysis, the regression model is verified by plotting the residuals (ɛ) against
the predicted values of the dependent variable (ŷ). The residuals of each measurement have been
computed in Table 4.

Figure 3 Residual Analysis Plot for Multiple Regression Obtained Using Excel (Left) and R Program (Right)
3. ANALYSIS OF WHETHER MORE INDEPENDENT VARIABLES GIVE A
BETTER MODEL
A better model is indicated by a higher coefficient of determination (R²) and by residuals
(ɛ) that scatter more evenly and narrowly around zero. The values from the simple linear
regression and the multiple linear regression are summarized below.
Table 5 Comparison of the Analysis Results Between Simple and Multiple Linear Regression

Type of Regression                                       R² from Goodness of Fit    Residual Analysis Plot
Simple Linear Regression
(1 Dependent Variable vs 1 Independent Variable)         0.62504                    see Figure 2
Multiple Linear Regression
(1 Dependent Variable vs 4 Independent Variables)        0.79614                    see Figure 3

Based on the summary table above, it can be concluded that more independent variables give
a better model for this data. This is shown mathematically by the higher coefficient of
determination (R²) of the model with 4 independent variables, and graphically by its
residuals, which are spread more narrowly around zero in the residual plot.

However, it is more realistic to compute an adjusted value of R² to avoid overestimating the
impact of adding extra variables, using the formula:
\[ R_a^2 = 1 - (1 - R^2)\,\frac{n-1}{n-p-1} \]
where n is the sample size and p is the number of independent variables. With n = 91, we get:


• Adjusted R² of Simple Linear Regression = 0.6208
• Adjusted R² of Multiple Linear Regression = 0.7867
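The two adjusted values can be reproduced with a small helper. This is a minimal sketch; the function name `adjusted_r2` is hypothetical, and the inputs are the R² values and sample size reported earlier in this document.

```r
# Adjusted R-squared from R-squared (r2), sample size (n),
# and number of independent variables (p)
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

adj_simple   <- adjusted_r2(0.62504, n = 91, p = 1)  # simple linear regression
adj_multiple <- adjusted_r2(0.79614, n = 91, p = 4)  # multiple linear regression

print(round(c(simple = adj_simple, multiple = adj_multiple), 4))
```

The penalty factor (n - 1)/(n - p - 1) grows with p, so the adjusted value only rises when a new variable improves R² by more than the penalty costs.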
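Moreover, it is important to choose the independent variables carefully by checking whether
each one is actually related to the dependent variable. In the multiple regression output
above, for example, the P-values of x2 (0.8199) and x4 (0.7486) are far above 0.05, whereas
those of x1 and x3 are below it.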
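One common way to check whether extra variables earn their place is to compare a smaller model with the full one. The sketch below is an illustration under stated assumptions, not part of the original analysis: it uses the first twelve rows of Table 2 only, and the choice of x1 and x3 for the reduced model is an assumption made here for demonstration.

```r
# First twelve rows of Table 2 only (illustrative subset, not the full 91 days)
y  <- c(1.2, 0.99, 1.04, 1.38, 1.34, 1.44, 1.59, 1.05, 0.66, 1.03, 1.03, 1.48)
x1 <- c(12.9, 11.13, 11.19, 9.9, 11.19, 11.8, 9.88, 10.44, 9.3, 10.97, 10.45, 8.22)
x2 <- c(12.97, 10.86, 11.23, 10.26, 11.27, 11.98, 10.59, 10.94, 9.58, 11.65, 11.12, 8.84)
x3 <- c(6.23, 7.57, 6.96, 4.82, 4.64, 5.13, 4.46, 6.23, 6.47, 6.9, 6.65, 3.91)
x4 <- c(2.3, 1.16, 2.34, 2.34, 1.54, 2.27, 2.79, 2.27, 1.81, 1.62, 2.39, 1.88)

reduced <- lm(y ~ x1 + x3)            # assumed smaller model
full    <- lm(y ~ x1 + x2 + x3 + x4)  # model with all four variables

# Plain R-squared never decreases when variables are added to a nested model,
# so adjusted R-squared is the fairer comparison.
r2     <- c(reduced = summary(reduced)$r.squared,
            full    = summary(full)$r.squared)
adj_r2 <- c(reduced = summary(reduced)$adj.r.squared,
            full    = summary(full)$adj.r.squared)
print(r2)
print(adj_r2)

# Partial F-test: do x2 and x4 together explain significantly more variation?
comparison <- anova(reduced, full)
print(comparison)
```

If the adjusted R² does not improve and the partial F-test is not significant, the added variables contribute little beyond what the smaller model already explains.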
