Regression Analysis With Excel: 1. How To Install the Statistical Package in Excel

The document discusses how to perform regression analysis in Excel. It includes: 1) Installing the analysis toolpak add-in to access regression functions. 2) Examples of using functions like LN and GEOMEAN. 3) Using the data analysis tool to calculate descriptive statistics and determine parameters like means and confidence intervals. 4) Examples of simple, multiple and logarithmic (log-linear) regression analyses in Excel including viewing outputs, coefficients, and comparing models using RSS.

Uploaded by

Moni Lacatus
Copyright
© All Rights Reserved

Appendix 1 – Excel

Regression Analysis with Excel

1. How to install the statistical package in Excel

• Open Excel;
• From File, select Options;

• Select Add-Ins, then Analysis ToolPak → OK;



• Select Analysis ToolPak and Analysis ToolPak - VBA → OK.

The statistical package is now installed; you can find it under Data → Data Analysis.


2. Using some functions like logarithm and geometric mean

For the natural logarithm, use "LN(number)":

For the geometric mean, use "GEOMEAN(number1, number2, ...)":
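For readers who want to check the two Excel functions outside the spreadsheet, here is a minimal Python sketch. The sample values are hypothetical and are not taken from the appendix; `math.log` plays the role of LN and `statistics.geometric_mean` the role of GEOMEAN.

```python
import math
from statistics import geometric_mean

# Hypothetical sample values, chosen only for illustration.
values = [2.0, 8.0, 4.0]

# Equivalent of Excel's =LN(number), applied to each value.
natural_logs = [math.log(v) for v in values]

# Equivalent of Excel's =GEOMEAN(number1, number2, ...).
geo_mean = geometric_mean(values)

# The geometric mean is also the exponential of the mean of the logarithms.
geo_mean_via_logs = math.exp(sum(natural_logs) / len(natural_logs))

print(natural_logs)
print(geo_mean, geo_mean_via_logs)
```

The last two printed numbers should agree up to floating-point rounding, which shows why the two functions are introduced together.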


3. How to determine some parameters

• Data → Data Analysis → Descriptive Statistics

o Input Range: B1:D13;
o Labels in first row (selected because the range includes the name of each variable);
o Output Range: F1 (the cell from which Excel starts writing the results);
o Choose the analyses: Summary statistics (for the parameters) and Confidence Level for Mean (for the confidence intervals of the mean).

Statistic                  Arrivals (Sosiri)   GDP (PIB)    ISD
Mean                       5035795.417         7409.58      269.5
Standard Error             204560.0024         737.476      46.4152
Median                     4968809             8248.5       276.5
Mode                       #N/A                #N/A         #N/A
Standard Deviation         708616.6347         2554.69      160.787
Sample Variance            5.02138E+11         6526453      25852.45
Kurtosis                   -0.789626398        -0.7573      -0.85365
Skewness                   0.37785518          -0.8196      -0.04613
Range                      2332274             7392         512
Minimum                    3982591             2768         29
Maximum                    6314865             10160        541
Sum                        60429545            88915        3234
Count                      12                  12           12
Confidence Level (95.0%)   450233.5293         1623.17      102.1592
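The entries of the output are linked by simple formulas, which the following Python sketch retraces for the Arrivals column. The inputs are copied from the output above; the critical value 2.201 (two-sided 95%, Student's t with 11 degrees of freedom) is an assumption taken from a standard t table, not from the appendix.

```python
import math

# Values copied from the Arrivals column of the Descriptive Statistics output.
n = 12
total = 60429545
std_dev = 708616.6347
minimum, maximum = 3982591, 6314865

# Assumption: two-sided 95% critical value of Student's t with n - 1 = 11
# degrees of freedom, read from a standard t table.
T_CRIT_95_DF11 = 2.201

mean = total / n                         # "Mean" = Sum / Count
std_error = std_dev / math.sqrt(n)       # "Standard Error" = SD / sqrt(n)
value_range = maximum - minimum          # "Range" = Maximum - Minimum
half_width = T_CRIT_95_DF11 * std_error  # "Confidence Level(95.0%)"

# The 95% confidence interval for the mean is mean ± half_width.
ci_low, ci_high = mean - half_width, mean + half_width
print(mean, std_error, value_range, half_width)
print(ci_low, ci_high)
```

The "Confidence Level(95.0%)" row is therefore the half-width of the interval, not the interval itself; the bounds are obtained by subtracting it from and adding it to the mean.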


4. Simple linear regression

• Data → Data Analysis → Regression

• Input Y Range: the dependent (explained) variable. In our case Arrivals: B1:B13;
• Input X Range: the independent (explanatory) variable. In our case GDP: C1:C13;
• Select Labels;
• Select Output Range: I1;
• Click OK.


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.806955069
R Square 0.651176484
Adjusted R Square 0.616294132
Standard Error 438945.4619
Observations 12

ANOVA
df SS MS F Significance F
Regression 1 3.59678E+12 3.6E+12 18.66779 0.001511637
Residual 10 1.92673E+12 1.93E+11
Total 11 5.52351E+12

Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 3377293.839 404230.4469 8.354872 8.04E-06 2476612.279 4277975.399 2476612.279 4277975.399
PIB 223.8319624 51.8054709 4.320624 0.001512 108.4021804 339.2617444 108.4021804 339.2617444
Arrivals = 3377293.8 + 223.83 × GDP
s(b1) = 51.80
t1 = 4.3206
R² = 0.6511
p-value = 0.0015
RSS = 1.92 × 10¹²
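The summary figures can be recomputed from the ANOVA and coefficient tables, which helps when reading an Excel regression output for the first time. The sketch below uses only numbers printed in the output above; the variable names are illustrative.

```python
import math

# Numbers copied from the SUMMARY OUTPUT of the simple regression.
n, k = 12, 1                      # observations, explanatory variables
ss_regression = 3.59678e12        # ANOVA, Regression SS
ss_residual = 1.92673e12          # ANOVA, Residual SS (the RSS)
slope, slope_se = 223.8319624, 51.8054709

ss_total = ss_regression + ss_residual
df_residual = n - k - 1

r_squared = ss_regression / ss_total                         # "R Square"
std_error_regression = math.sqrt(ss_residual / df_residual)  # "Standard Error"
t_stat = slope / slope_se                                    # "t Stat" of the slope
f_stat = (ss_regression / k) / (ss_residual / df_residual)   # "F"

print(r_squared, std_error_regression, t_stat, f_stat)
```

With a single explanatory variable the F statistic equals the square of the slope's t statistic, which is why "Significance F" and the slope's p-value coincide in this output.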

5. The multiplicative model (log-linear)


• Input Y Range: ln(Arrivals): E1:E13;
• Input X Range: ln(GDP): F1:F13.
SUMMARY OUTPUT

Regression Statistics
Multiple R 0.807289745
R Square 0.651716733
Adjusted R Square 0.616888406
Standard Error 0.086375703
Observations 12

ANOVA
df SS MS F Significance F
Regression 1 0.139607726 0.139608 18.71226 0.001499481
Residual 10 0.074607621 0.007461
Total 11 0.214215347

Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 13.12880382 0.530968194 24.72616 2.67E-10 11.94573297 14.31187468 11.94573297 14.31187468
ln(PIB) 0.25963223 0.060019928 4.325767 0.001499 0.125899496 0.393364964 0.125899496 0.393364964
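Because both variables are in logarithms, the fitted equation ln(Arrivals) = 13.1288 + 0.2596 × ln(GDP) can be turned back into the multiplicative form Arrivals = e^13.1288 × GDP^0.2596. The Python sketch below illustrates this using the two coefficients from the output above; the function names are hypothetical, and no retransformation (smearing) correction is applied.

```python
import math

# Coefficients copied from the log-linear SUMMARY OUTPUT.
INTERCEPT = 13.12880382   # constant of ln(Arrivals) = a + b * ln(GDP)
ELASTICITY = 0.25963223   # coefficient of ln(PIB)


def predict_arrivals(gdp: float) -> float:
    """Multiplicative form of the model: Arrivals = exp(a) * GDP**b."""
    return math.exp(INTERCEPT) * gdp ** ELASTICITY


def pct_change_in_arrivals(pct_change_in_gdp: float) -> float:
    """Predicted percentage change in Arrivals for a given percentage change in GDP."""
    factor = (1 + pct_change_in_gdp / 100) ** ELASTICITY
    return (factor - 1) * 100


# Example: prediction at the sample mean of GDP and effect of a 1% GDP increase.
print(predict_arrivals(7409.58))
print(pct_change_in_arrivals(1.0))
```

The slope is read as an elasticity: a 1% increase in GDP is associated with roughly a 0.26% increase in Arrivals.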


6. How to compare models using RSS

Linear model (Arrivals*-L) –Input Y Range H1:H13


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.806955069
R Square 0.651176484
Adjusted R Square 0.616294132
Standard Error 0.087950238
Observations 12

ANOVA
df SS MS F Significance F
Regression 1 0.144399931 0.1444 18.66779 0.001511637
Residual 10 0.077352443 0.007735
Total 11 0.221752374

Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 0.676698636 0.08099449 8.354872 8.04E-06 0.496231667 0.857165605 0.496231667 0.857165605
PIB 4.48486E-05 1.03801E-05 4.320624 0.001512 2.17202E-05 6.79769E-05 2.17202E-05 6.79769E-05

RSS(linear) = 0.07735

Exponential model (ln(Arrivals*)- GDP) – Input Y Range I1:I13

SUMMARY OUTPUT

Regression Statistics
Multiple R 0.827877132
R Square 0.685380546
Adjusted R Square 0.653918601
Standard Error 0.082095259
Observations 12

ANOVA
df SS MS F Significance F
Regression 1 0.146819032 0.146819 21.78443 0.000884326
Residual 10 0.067396316 0.00674
Total 11 0.214215347

Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept -0.33508117 0.075602566 -4.43214 0.00127 -0.503534187 -0.16662816 -0.503534187 -0.166628159
PIB 4.52227E-05 9.68909E-06 4.667379 0.000884 2.3634E-05 6.68113E-05 2.3634E-05 6.68113E-05

RSS(exponential) = 0.06739
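The two RSS values are comparable only because the dependent variable was rescaled first. The appendix does not spell out the rescaling; the sketch below assumes that Arrivals* is Arrivals divided by a constant (presumably its geometric mean, given section 2) and recovers that constant from the two intercepts. All numbers are copied from the outputs above.

```python
# Numbers copied from the regression outputs in sections 4 and 6.
rss_linear_raw = 1.92673e12        # Residual SS, Arrivals on GDP
intercept_raw = 3377293.839        # intercept, Arrivals on GDP
intercept_scaled = 0.676698636     # intercept, Arrivals* on GDP
rss_linear_scaled = 0.077352443    # Residual SS, Arrivals* on GDP
rss_exponential = 0.067396316      # Residual SS, ln(Arrivals*) on GDP

# Assumption: Arrivals* = Arrivals / scale, so every coefficient shrinks by
# the factor `scale` and the RSS shrinks by scale**2.
scale = intercept_raw / intercept_scaled
rss_check = rss_linear_raw / scale ** 2

# After rescaling, the model with the smaller RSS fits better.
preferred = "exponential" if rss_exponential < rss_linear_scaled else "linear"

print(scale, rss_check, preferred)
```

Under this assumption the implied constant is about 4.99 million, and the recomputed RSS matches the 0.07735 reported for the linear model, so the exponential model (RSS 0.06739) is the one preferred here.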


7. Multiple linear regression

The only difference is that Input X Range must cover both explanatory variables: C1:D13.

SUMMARY OUTPUT

Regression Statistics
Multiple R 0.836658726
R Square 0.699997825
Adjusted R Square 0.633330674
Standard Error 429090.2368
Observations 12

ANOVA
df SS MS F Significance F
Regression 2 3.86645E+12 1.93E+12 10.49989 0.004436697
Residual 9 1.65707E+12 1.84E+11
Total 11 5.52351E+12

Coefficients Standard Error t Stat P-value Lower 95% Upper 95% Lower 95.0% Upper 95.0%
Intercept 3499250.585 407801.7721 8.580764 1.26E-05 2576738.887 4421762.283 2576738.887 4421762.283
PIB 153.4307248 77.12758717 1.989311 0.077883 -21.04399861 327.9054482 -21.04399861 327.9054482
ISD 1483.06898 1225.455227 1.210219 0.25702 -1289.103335 4255.241295 -1289.103335 4255.241295
Arrivals = 3499251 + 153.43 × GDP + 1483.07 × ISD
s(b1) = 77.12, s(b2) = 1225.45
t1 = 1.989, t2 = 1.210 (p1 = 0.0778, p2 = 0.2570)
F = 10.49 (p = 0.0044)
R² = 0.6999
Adjusted R² = 0.6333
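With two explanatory variables, the adjusted R² and the F statistic follow from the ANOVA table and the degrees of freedom. A short Python sketch, using only numbers printed in the output above, makes the link explicit.

```python
# Numbers copied from the SUMMARY OUTPUT of the multiple regression.
n, k = 12, 2                  # observations, explanatory variables (GDP, ISD)
ss_regression = 3.86645e12    # ANOVA, Regression SS
ss_residual = 1.65707e12      # ANOVA, Residual SS

ss_total = ss_regression + ss_residual
df_residual = n - k - 1

r_squared = ss_regression / ss_total
# Adjusted R Square penalises R Square for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_residual
# F tests the joint significance of all slope coefficients.
f_stat = (ss_regression / k) / (ss_residual / df_residual)

print(r_squared, adj_r_squared, f_stat)
```

Compared with the simple regression, R² rises from 0.6511 to 0.6999 while the adjusted R² barely moves (0.6163 to 0.6333), in line with neither individual t statistic being significant at the 5% level.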

8. Regression analysis in practice


RESIDUAL OUTPUT

Observation Predicted Arrivals (Sosiri) Residuals


1 3996860.711 -14269.71055
2 4169882.817 169575.1825
3 4427289.574 50646.42576
4 4677981.372 47466.62788
5 5195704.701 16465.29887
6 5651426.577 -406134.5766
7 5217192.57 -677334.5695
8 5229950.991 -644739.9914
9 5439457.708 -81694.70818
10 5293295.437 447008.5633
11 5519365.719 389283.2812
12 5611136.823 703728.1767

8.1.1. Graphical methods

Residuals(t)     Residuals(t-1)
-14269.71055
169575.1825      -14269.71055
50646.42576      169575.1825
47466.62788      50646.42576
16465.29887      47466.62788
-406134.5766     16465.29887
-677334.5695     -406134.5766
-644739.9914     -677334.5695
-81694.70818     -644739.9914
447008.5633      -81694.70818
389283.2812      447008.5633
703728.1767      389283.2812


[Figure: scatter plot of Residuals(t) vs Residuals(t-1)]

8.1.2. Formal test

H0: there is no autocorrelation;
H1: there is autocorrelation.

• Runs test
Residuals
-14269.7
169575.2
50646.43
47466.63
16465.3
-406135
-677335
-644740
-81694.7
447008.6
389283.3
703728.2

• Durbin-Watson test

H0*: there is no positive autocorrelation;
H0: there is no negative autocorrelation.


ε(t)        ε(t-1)      (ε(t) - ε(t-1))²   ε(t)²
-14269.7
169575.2    -14269.7    33798944708        28755742526
50646.43    169575.2    14144049183        2565060443
47466.63    50646.43    10111114.56        2253080763
16465.3     47466.63    961082400.8        271106066.8
-406135     16465.3     1.78591E+11        1.64945E+11
-677335     -406135     73549436179        4.58782E+11
-644740     -677335     1062406524         4.1569E+11
-81694.7    -644740     3.1702E+11         6674025345
447008.6    -81694.7    2.79527E+11        1.99817E+11
389283.3    447008.6    3332208184         1.51541E+11
703728.2    389283.3    98875592255        4.95233E+11
Sum                     1.00087E+12        1.92653E+12

So d = Σ(ε(t) - ε(t-1))² / Σε(t)² = 1.00087 / 1.92653 ≈ 0.51946.
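The same statistic can be computed directly from the residuals listed in the RESIDUAL OUTPUT. The Python sketch below uses the usual textbook definition, in which the denominator sums the squares of all residuals including the first; the function name is illustrative.

```python
# Residuals copied from the RESIDUAL OUTPUT of the simple regression.
residuals = [
    -14269.71055, 169575.1825, 50646.42576, 47466.62788,
    16465.29887, -406134.5766, -677334.5695, -644739.9914,
    -81694.70818, 447008.5633, 389283.2812, 703728.1767,
]


def durbin_watson(e):
    """d = sum of squared successive differences / sum of squared residuals."""
    numerator = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    denominator = sum(x * x for x in e)
    return numerator / denominator


d = durbin_watson(residuals)
print(round(d, 4))
```

The statistic always lies between 0 and 4; values near 2 suggest no first-order autocorrelation, while a value near 0.52, as here, points towards positive autocorrelation (the formal decision still requires the tabulated bounds dL and dU).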

Remedial measures

1. ρ = 1;


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.741072748
R Square 0.549188817
Adjusted R Square 0.47226574
Standard Error 2780.793194
Observations 14

ANOVA
df SS MS F Significance F
Regression 1 122463802.8 1.22E+08 15.83691 0.001827732
Residual 13 100526540.3 7732811
Total 14 222990343

Coefficients Standard Error t Stat P-value


Intercept 0 #N/A #N/A #N/A
X Variable 1 46.30571527 11.63588495 3.979561 0.001571

Applying a Runs test for the new model, we have:


RESIDUAL OUTPUT

Observation Predicted Y Residuals


1 166897.4941 189969.5059
2 248295.1076 -109817.1076
3 241817.8439 5694.15607
4 499397.0295 -12675.02947
5 439590.2949 -406468.2949
6 -418863.0511 -286570.9489
7 12306.80099 33046.19901
8 202090.6267 570461.3733
9 -140988.4394 523529.4394
10 218067.8771 -49722.87712
11 88522.60358 317693.3964

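Signs of the residuals: (+), (-), (+), (- - -), (+ + +), (-), (+), so k = 7 runs, with N1 = 6 positive and N2 = 5 negative residuals; the critical values are n1 = 3 and n2 = 10. Because n1 ≤ k ≤ n2, the residuals of the new model are not autocorrelated.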
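Counting runs by hand is error-prone, so a small Python sketch of the counting step may help. It uses the residuals of the RESIDUAL OUTPUT above; the critical values 3 and 10 are the ones quoted in the text, and the function name is hypothetical.

```python
# Residuals copied from the RESIDUAL OUTPUT of the transformed model.
residuals = [
    189969.5059, -109817.1076, 5694.15607, -12675.02947,
    -406468.2949, -286570.9489, 33046.19901, 570461.3733,
    523529.4394, -49722.87712, 317693.3964,
]


def count_runs(values):
    """Return (number of runs, count of positive values, count of negative values)."""
    signs = [v > 0 for v in values if v != 0]  # zeros carry no sign and are skipped
    if not signs:
        return 0, 0, 0
    # A new run starts every time the sign differs from the previous one.
    changes = sum(1 for prev, cur in zip(signs, signs[1:]) if prev != cur)
    n_pos = sum(signs)
    return changes + 1, n_pos, len(signs) - n_pos


k, n_pos, n_neg = count_runs(residuals)

# Critical values quoted in the text for N1 = 6, N2 = 5.
lower, upper = 3, 10
no_autocorrelation = lower <= k <= upper
print(k, n_pos, n_neg, no_autocorrelation)
```

The decision rule coded in the last step is the one used in this appendix: the hypothesis of no autocorrelation is kept when the number of runs lies between the two critical values.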

2. ρ = 1 – d/2 = 0.74027
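A minimal Python sketch of this step, assuming the usual generalized-difference transformation x*(t) = x(t) - ρ·x(t-1) applied to both the dependent and the explanatory variable before the regression is rerun. The helper name and the three-value series are hypothetical; only d comes from the text.

```python
# Durbin-Watson statistic reported earlier in the appendix.
d = 0.51946

# Approximate first-order autocorrelation coefficient implied by d.
rho = 1 - d / 2


def quasi_difference(series, rho):
    """Generalized difference x(t) - rho * x(t-1); the first observation is lost."""
    return [series[t] - rho * series[t - 1] for t in range(1, len(series))]


# Hypothetical short series, used only to show the mechanics.
example = [1.0, 2.0, 3.0]
print(round(rho, 5))
print(quasi_difference(example, 0.5))
```

Because the first observation is dropped, a sample of 12 observations leaves 11 for the transformed regression, which matches the Observations count in the output that follows.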


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.585360723
R Square 0.342647176
Adjusted R Square 0.269607973
Standard Error 323023.4298
Observations 11

ANOVA
df SS MS F Significance F
Regression 1 4.89507E+11 4.9E+11 4.691278 0.058499388
Residual 9 9.39097E+11 1.04E+11
Total 10 1.4286E+12

Coefficients Standard Error t Stat P-value


Intercept 949257.3138 267884.3557 3.543534 0.006279
X Variable 1 214.537597 99.05076757 2.165936 0.058499


Applying a Runs test for the new model, we have:

RESIDUAL OUTPUT

Observation Predicted Y Residuals


1 1269332.957 121932.4032
2 1393286.621 -127721.1951
3 1450930.621 -40364.30382
4 1769282.446 -55219.83734
5 1838740.171 -451861.2571
6 1099188.372 -442262.6812
7 1419519.564 -195029.246
8 1611274.258 352194.5954
9 1322529.668 451583.1164
10 1642919.323 16354.83458
11 1570475.834 370393.5711

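Signs of the residuals: (+), (- - - - - -), (+ + + +), so k = 3 runs, with N1 = 5 positive and N2 = 6 negative residuals; the critical values are n1 = 3 and n2 = 10. Because n1 ≤ k ≤ n2, the residuals of the new model are not autocorrelated.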
3. ε(t) = ρ·ε(t-1) + v(t) (regression through the origin).

Coefficients Standard Error t Stat P-value


Intercept 0 #N/A #N/A #N/A
X Variable 1 0.823317266 0.258449433 3.185603 0.009726

So, ρ=0.823317.
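For a regression through the origin with one regressor, the slope has a closed form, so ρ can be checked without the Regression tool. The Python sketch below applies it to the residuals of the original simple regression (section 8, RESIDUAL OUTPUT); the variable names are illustrative.

```python
# Residuals copied from the RESIDUAL OUTPUT of the original simple regression.
residuals = [
    -14269.71055, 169575.1825, 50646.42576, 47466.62788,
    16465.29887, -406134.5766, -677334.5695, -644739.9914,
    -81694.70818, 447008.5633, 389283.2812, 703728.1767,
]

current = residuals[1:]    # e(t),   t = 2..12
lagged = residuals[:-1]    # e(t-1), t = 2..12

# Slope of a no-intercept regression of e(t) on e(t-1):
# rho = sum(e(t) * e(t-1)) / sum(e(t-1)^2)
rho_hat = sum(c * l for c, l in zip(current, lagged)) / sum(l * l for l in lagged)

print(round(rho_hat, 4))
```

In Excel the same estimate is obtained by ticking "Constant is Zero" in the Regression dialog, which is why the Intercept row of the output shows 0 and #N/A.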


SUMMARY OUTPUT

Regression Statistics
Multiple R 0.569726689
R Square 0.3245885
Adjusted R Square 0.249542778
Standard Error 319542.785
Observations 11

ANOVA
df SS MS F Significance F
Regression 1 4.41637E+11 4.42E+11 4.32521 0.067296378
Residual 9 9.18968E+11 1.02E+11
Total 10 1.36061E+12

Coefficients Standard Error t Stat P-value


Intercept 692740.6083 210177.8062 3.295974 0.009289
X Variable 1 201.9650187 97.11192438 2.079714 0.067296

Applying a Runs test for the new model, we have:


RESIDUAL OUTPUT

Observation Predicted Y Residuals


1 947632.2857 112890.8399
2 1051356.664 -146170.2061
3 1086334.067 -47646.90024
4 1367244.173 -45615.84449
5 1393836.459 -439812.6268
6 663475.7317 -442155.8052
7 997573.3303 -150104.5993
8 1177134.558 405546.2776
9 889612.166 439554.4742
10 1202178.48 -19619.34854
11 1117040.092 333133.7391
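Signs of the residuals: (+), (- - - - - -), (+ +), (-), (+), so k = 5 runs, with N1 = 4 positive and N2 = 7 negative residuals; the critical values are n1 = 2 and n2 = 9. Because n1 ≤ k ≤ n2, the residuals of the new model are not autocorrelated.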
