Asst 3

The document outlines Assignment #3 for the PE562 course on Data Analytics in Petroleum Engineering, due on February 14, 2024. It includes a dataset and a series of tasks involving regression analysis, such as estimating parameters, creating an ANOVA table, testing statistical significance, and calculating explained variation. The document also provides MATLAB code snippets for performing the required calculations.

PE562 – Data Analytics in Petroleum Engineering

Fall 2023/2024 – Assignment #3

Due Date: Wednesday, Feb 14th, 2024

Name: Nedal Naser Al-fagi


ID: 3118239

Problem #1:

For the following dataset, using Excel or MATLAB:

Y X1 X2
6 1 8
8 4 2
1 9 -8
0 11 -10
5 3 6
3 8 -6
2 5 0
-4 10 -12
10 2 4
-3 7 -2
5 6 -4

a) Estimate the parameters in the model:

$$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$$

b) Write out the analysis of variance table.


c) Using α = 0.05, test whether the overall regression is statistically significant.
d) What proportion of the total variation about the mean is explained by the model?
e) Calculate the variance of each regression coefficient estimate.
f) How useful is the regression using X1 alone?
g) How useful is the regression using X2 alone?
h) What are your conclusions?

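Solution

(a):

$$x = \begin{bmatrix} 1 & 1 & 8 \\ 1 & 4 & 2 \\ 1 & 9 & -8 \\ 1 & 11 & -10 \\ 1 & 3 & 6 \\ 1 & 8 & -6 \\ 1 & 5 & 0 \\ 1 & 10 & -12 \\ 1 & 2 & 4 \\ 1 & 7 & -2 \\ 1 & 6 & -4 \end{bmatrix}, \quad y = \begin{bmatrix} 6 \\ 8 \\ 1 \\ 0 \\ 5 \\ 3 \\ 2 \\ -4 \\ 10 \\ -3 \\ 5 \end{bmatrix}$$

$$x^T = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 4 & 9 & 11 & 3 & 8 & 5 & 10 & 2 & 7 & 6 \\ 8 & 2 & -8 & -10 & 6 & -6 & 0 & -12 & 4 & -2 & -4 \end{bmatrix}$$

$$x^T x = \begin{bmatrix} 11 & 66 & -22 \\ 66 & 506 & -346 \\ -22 & -346 & 484 \end{bmatrix}, \quad (x^T x)^{-1} = \begin{bmatrix} 4.3705 & -0.8495 & -0.4086 \\ -0.8495 & 0.1690 & 0.0822 \\ -0.4086 & 0.0822 & 0.0422 \end{bmatrix}$$

$$\hat{\beta} = (x^T x)^{-1} x^T y = \begin{bmatrix} 14 \\ -2 \\ -0.5 \end{bmatrix} \implies \hat{\beta}_0 = 14, \quad \hat{\beta}_1 = -2, \quad \hat{\beta}_2 = -0.5$$

The fitted model is:

$$\hat{y}_i = 14 - 2 x_{1i} - 0.5 x_{2i}$$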

(b):

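k = 2, n = 11

$$x^T y = \begin{bmatrix} 33 \\ 85 \\ 142 \end{bmatrix}, \quad \frac{\left(\sum y_i\right)^2}{n} = \frac{33^2}{11} = 99$$

$$SS_R = \hat{\beta}^T x^T y - \frac{\left(\sum y_i\right)^2}{n} = 221 - 99 = 122$$

$$SS_E = y^T y - \hat{\beta}^T x^T y = 289 - 221 = 68$$

$$MS_R = \frac{SS_R}{k} = \frac{122}{2} = 61, \quad MS_E = \frac{SS_E}{n-k-1} = \frac{68}{8} = 8.5$$

$$F_o = \frac{MS_R}{MS_E} = \frac{61}{8.5} = 7.176$$

Source       Degrees of freedom   Sum of squares   Mean square   Fo
Regression   2                    122              61            7.176
Error        8                    68               8.5
Total        10                   190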
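
The sums of squares in the ANOVA table can also be cross-checked from fitted values and residuals instead of the matrix shortcut formulas. The following is a minimal MATLAB sketch of that check; the variable names (X1, X2, yhat, SST, SSR, SSE) are chosen here for illustration and are not part of the original solution.

```matlab
% Cross-check of the ANOVA sums of squares using fitted values and residuals.
y  = [6;8;1;0;5;3;2;-4;10;-3;5];          % response
X1 = [1;4;9;11;3;8;5;10;2;7;6];           % first regressor
X2 = [8;2;-8;-10;6;-6;0;-12;4;-2;-4];     % second regressor
x  = [ones(size(y)), X1, X2];             % design matrix with intercept column

b    = x \ y;                             % least-squares estimates
yhat = x*b;                               % fitted values

SST = sum((y - mean(y)).^2);              % total variation about the mean
SSR = sum((yhat - mean(y)).^2);           % variation explained by the regression
SSE = sum((y - yhat).^2);                 % residual variation

fprintf('SST = %.4f, SSR = %.4f, SSE = %.4f\n', SST, SSR, SSE);
```

Because the model contains an intercept, SSR and SSE computed this way should add up to SST, which gives a quick sanity check on the table.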

(c):
• Critical value Fcri(α, ν1, ν2): Fcri(0.05, 2, 8) = 4.459.
• Hypothesis test:
H0: β1 = β2 = 0    H1: βj ≠ 0 for at least one j

Since Fo = 7.176 > Fcri = 4.459, we reject the null hypothesis β1 = β2 = 0. At least one of the coefficients is nonzero, so the overall regression is significant.
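
The same decision can be reached through a p-value rather than a tabulated critical value. The sketch below is one way to do this in base MATLAB, using the standard identity between the upper tail of the F distribution and the regularized incomplete beta function; the variable names (d1, d2, z, p, reject) are illustrative assumptions, not part of the original solution.

```matlab
% Upper-tail p-value of the observed F statistic for the overall regression.
% The inputs below are taken from the ANOVA table of part (b).
Fo    = 61/8.5;   % MS_R / MS_E
d1    = 2;        % regression degrees of freedom, k
d2    = 8;        % error degrees of freedom, n - k - 1
alpha = 0.05;     % significance level

% P(F > Fo) = I_z(d2/2, d1/2) with z = d2/(d2 + d1*Fo),
% where I_z is the regularized incomplete beta function.
z = d2/(d2 + d1*Fo);
p = betainc(z, d2/2, d1/2);

reject = p < alpha;   % true means H0 is rejected at level alpha
fprintf('Fo = %.4f, p-value = %.4f, reject H0: %d\n', Fo, p, reject);
```

If the Statistics and Machine Learning Toolbox is available, finv(1 - alpha, d1, d2) should return the critical value directly.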

(d):
$$R^2 = \frac{SS_R}{SS_T} = \frac{122}{190} = 0.6421$$

That is, the model explains 64.21% of the total variation about the mean.

(e):

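$$s^2 = MS_E = \frac{SS_E}{n-k-1} = \frac{68}{8} = 8.5$$

$$\mathrm{Var}(\hat{\beta}_1) = s^2 C_{11} = 8.5 \times 0.1690 \approx 1.436$$

$$\mathrm{Var}(\hat{\beta}_2) = s^2 C_{22} = 8.5 \times 0.0422 \approx 0.359$$

Here C_jj is the diagonal element of (x^T x)^{-1} that corresponds to the coefficient estimate β̂_j.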
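
The coefficient variances in part (e) come from multiplying the residual mean square by the diagonal of the inverse of x'x. A minimal MATLAB sketch of that calculation follows; it also reports the intercept variance and the standard errors, and the names (s2, C, var_b, se_b) are illustrative rather than taken from the original solution.

```matlab
% Variance of each regression coefficient estimate: Var(b_j) = s^2 * C_jj.
y  = [6;8;1;0;5;3;2;-4;10;-3;5];
X1 = [1;4;9;11;3;8;5;10;2;7;6];
X2 = [8;2;-8;-10;6;-6;0;-12;4;-2;-4];
x  = [ones(size(y)), X1, X2];     % design matrix with intercept column

n = size(x, 1);                   % number of observations
p = size(x, 2);                   % number of coefficients, k + 1

b  = x \ y;                       % least-squares estimates
s2 = sum((y - x*b).^2)/(n - p);   % residual mean square, MS_E

C     = inv(x'*x);                % inverse of x'x, needed for its diagonal
var_b = s2*diag(C);               % variances of b0, b1, b2 (in that order)
se_b  = sqrt(var_b);              % corresponding standard errors

disp([b, var_b, se_b]);           % columns: estimate, variance, standard error
```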

(f):

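$$x = \begin{bmatrix} 1 & 1 \\ 1 & 4 \\ 1 & 9 \\ 1 & 11 \\ 1 & 3 \\ 1 & 8 \\ 1 & 5 \\ 1 & 10 \\ 1 & 2 \\ 1 & 7 \\ 1 & 6 \end{bmatrix}, \quad y = \begin{bmatrix} 6 \\ 8 \\ 1 \\ 0 \\ 5 \\ 3 \\ 2 \\ -4 \\ 10 \\ -3 \\ 5 \end{bmatrix}$$

$$x^T = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 4 & 9 & 11 & 3 & 8 & 5 & 10 & 2 & 7 & 6 \end{bmatrix}$$

$$x^T x = \begin{bmatrix} 11 & 66 \\ 66 & 506 \end{bmatrix}, \quad (x^T x)^{-1} = \begin{bmatrix} 0.4182 & -0.0545 \\ -0.0545 & 0.0091 \end{bmatrix}$$

$$\hat{\beta} = (x^T x)^{-1} x^T y = \begin{bmatrix} 9.1636 \\ -1.0273 \end{bmatrix} \implies \hat{\beta}_0 = 9.1636, \quad \hat{\beta}_1 = -1.0273$$

The fitted model is:

$$\hat{y}_i = 9.1636 - 1.0273 x_{1i}$$

• ANOVA table:

k = 1, n = 11

$$x^T y = \begin{bmatrix} 33 \\ 85 \end{bmatrix}, \quad \frac{\left(\sum y_i\right)^2}{n} = \frac{33^2}{11} = 99$$

$$SS_R = \hat{\beta}^T x^T y - \frac{\left(\sum y_i\right)^2}{n} = 215.0818 - 99 = 116.0818$$

$$SS_E = y^T y - \hat{\beta}^T x^T y = 289 - 215.0818 = 73.9182$$

$$MS_R = \frac{SS_R}{k} = \frac{116.0818}{1} = 116.0818, \quad MS_E = \frac{SS_E}{n-k-1} = \frac{73.9182}{9} = 8.2131$$

$$F_o = \frac{MS_R}{MS_E} = \frac{116.0818}{8.2131} = 14.1337$$

Source       Degrees of freedom   Sum of squares   Mean square   Fo
Regression   1                    116.0818         116.0818      14.1337
Error        9                    73.9182          8.2131
Total        10                   190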

Critical value Fcri(α, ν1, ν2): Fcri(0.05, 1, 9) = 5.1174.

Hypothesis test:
H0: β1 = 0    H1: β1 ≠ 0

Since Fo = 14.1337 > Fcri = 5.1174, we reject the null hypothesis β1 = 0, so the regression on X1 alone is significant.

$$R^2 = \frac{SS_R}{SS_T} = \frac{116.0818}{190} = 0.611$$

That is, X1 alone explains 61.1% of the total variation about the mean.

(g):

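$$x = \begin{bmatrix} 1 & 8 \\ 1 & 2 \\ 1 & -8 \\ 1 & -10 \\ 1 & 6 \\ 1 & -6 \\ 1 & 0 \\ 1 & -12 \\ 1 & 4 \\ 1 & -2 \\ 1 & -4 \end{bmatrix}, \quad y = \begin{bmatrix} 6 \\ 8 \\ 1 \\ 0 \\ 5 \\ 3 \\ 2 \\ -4 \\ 10 \\ -3 \\ 5 \end{bmatrix}$$

$$x^T = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 8 & 2 & -8 & -10 & 6 & -6 & 0 & -12 & 4 & -2 & -4 \end{bmatrix}$$

$$x^T x = \begin{bmatrix} 11 & -22 \\ -22 & 484 \end{bmatrix}, \quad (x^T x)^{-1} = \begin{bmatrix} 0.1 & 0.0045 \\ 0.0045 & 0.0023 \end{bmatrix}$$

$$\hat{\beta} = (x^T x)^{-1} x^T y = \begin{bmatrix} 3.9455 \\ 0.4727 \end{bmatrix} \implies \hat{\beta}_0 = 3.9455, \quad \hat{\beta}_2 = 0.4727$$

The fitted model is:

$$\hat{y}_i = 3.9455 + 0.4727 x_{2i}$$

• ANOVA table:

k = 1, n = 11

$$x^T y = \begin{bmatrix} 33 \\ 142 \end{bmatrix}, \quad \frac{\left(\sum y_i\right)^2}{n} = \frac{33^2}{11} = 99$$

$$SS_R = \hat{\beta}^T x^T y - \frac{\left(\sum y_i\right)^2}{n} = 197.3273 - 99 = 98.3273$$

$$SS_E = y^T y - \hat{\beta}^T x^T y = 289 - 197.3273 = 91.6727$$

$$MS_R = \frac{SS_R}{k} = \frac{98.3273}{1} = 98.3273, \quad MS_E = \frac{SS_E}{n-k-1} = \frac{91.6727}{9} = 10.1859$$

$$F_o = \frac{MS_R}{MS_E} = \frac{98.3273}{10.1859} = 9.6533$$

Source       Degrees of freedom   Sum of squares   Mean square   Fo
Regression   1                    98.3273          98.3273       9.6533
Error        9                    91.6727          10.1859
Total        10                   190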

Critical value Fcri(α, ν1, ν2): Fcri(0.05, 1, 9) = 5.1174.

Hypothesis test:
H0: β2 = 0    H1: β2 ≠ 0

Since Fo exceeds Fcri = 5.1174, we reject the null hypothesis β2 = 0, so the regression on X2 alone is significant.

$$R^2 = \frac{SS_R}{SS_T} = \frac{98.3273}{190} = 0.5175$$

That is, X2 alone explains 51.75% of the total variation about the mean.
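
To put the three fits from parts (a), (f) and (g) side by side, the same calculation can be looped over the three design matrices. This is a minimal MATLAB sketch under that idea; the cell arrays `models` and `names` and the result vectors R2 and Fo are illustrative names introduced here, not part of the original solution.

```matlab
% Compare the full model with the two single-regressor models.
y  = [6;8;1;0;5;3;2;-4;10;-3;5];
X1 = [1;4;9;11;3;8;5;10;2;7;6];
X2 = [8;2;-8;-10;6;-6;0;-12;4;-2;-4];
n  = numel(y);

models = {[X1, X2], X1, X2};                  % regressors used in each fit
names  = {'X1 and X2', 'X1 only', 'X2 only'};

SST = sum((y - mean(y)).^2);                  % total variation about the mean
R2  = zeros(1, 3);
Fo  = zeros(1, 3);

for m = 1:3
    x   = [ones(n, 1), models{m}];            % add the intercept column
    k   = size(x, 2) - 1;                     % number of regressors
    b   = x \ y;                              % least-squares estimates
    SSE = sum((y - x*b).^2);                  % residual sum of squares
    SSR = SST - SSE;                          % regression sum of squares
    R2(m) = SSR/SST;
    Fo(m) = (SSR/k)/(SSE/(n - k - 1));
    fprintf('%-10s  R2 = %.4f  Fo = %.4f\n', names{m}, R2(m), Fo(m));
end
```

Each Fo must still be compared with the critical value for its own degrees of freedom, (2, 8) for the full model and (1, 9) for the single-regressor models.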

(h):

Solution using MATLAB

(a,b):

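clc
clear
y = [6;8;1;0;5;3;2;-4;10;-3;5];                    % response vector y
x = [1,1,8; 1,4,2; 1,9,-8; 1,11,-10; 1,3,6; 1,8,-6; ...
     1,5,0; 1,10,-12; 1,2,4; 1,7,-2; 1,6,-4];      % design matrix [1, X1, X2]
n = length(y);                                     % number of observations
xT = x'                    % transpose of x
xTx = x'*x                 % x'x
xTx_inv = inv(xTx)         % inverse of x'x, shown for the hand calculation
b = xTx\(x'*y)             % least-squares estimates of the coefficients
xT_y = x'*y                % x'y
ssr = b'*x'*y              % uncorrected regression sum of squares, b'x'y
ssR = ssr - sum(y)^2/n     % regression sum of squares, SSR
sse = y'*y                 % uncorrected total sum of squares, y'y
ssE = sse - ssr            % error sum of squares, SSE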

Code is effective

(f):

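clc
clear
y = [6;8;1;0;5;3;2;-4;10;-3;5];                        % response vector y
x = [1,1;1,4;1,9;1,11;1,3;1,8;1,5;1,10;1,2;1,7;1,6];   % design matrix [1, X1]
n = length(y);                                         % number of observations
xT = x'                    % transpose of x
xTx = x'*x                 % x'x
xTx_inv = inv(xTx)         % inverse of x'x, shown for the hand calculation
b = xTx\(x'*y)             % least-squares estimates of the coefficients
xT_y = x'*y                % x'y
ssr = b'*x'*y              % uncorrected regression sum of squares, b'x'y
ssR = ssr - sum(y)^2/n     % regression sum of squares, SSR
sse = y'*y                 % uncorrected total sum of squares, y'y
ssE = sse - ssr            % error sum of squares, SSE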

Code is effective

(g):
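clc
clear
y = [6;8;1;0;5;3;2;-4;10;-3;5];                             % response vector y
x = [1,8;1,2;1,-8;1,-10;1,6;1,-6;1,0;1,-12;1,4;1,-2;1,-4];  % design matrix [1, X2]
n = length(y);                                              % number of observations
xT = x'                    % transpose of x
xTx = x'*x                 % x'x
xTx_inv = inv(xTx)         % inverse of x'x, shown for the hand calculation
b = xTx\(x'*y)             % least-squares estimates of the coefficients
xT_y = x'*y                % x'y
ssr = b'*x'*y              % uncorrected regression sum of squares, b'x'y
ssR = ssr - sum(y)^2/n     % regression sum of squares, SSR
sse = y'*y                 % uncorrected total sum of squares, y'y
ssE = sse - ssr            % error sum of squares, SSE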

The End
