Lecture 4 Regression Analysis
[Diagram: Relationship analysis. Methods shown: Pearson R Correlation, Spearman Rho Correlation, Chi-Square, and Simple Regression Analysis.]
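• Determine the straight line for which the differences between the observed values (Y) and the values predicted by the regression (Y-hat) are as small as possible.
[Figure: scatter plot with fitted regression line; y-axis: Sales (0 to 1800); x-axis: 0 to 80.]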
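The least-squares idea can be sketched in a few lines of Python. This is a minimal illustration, not output from EViews or SPSS: the function names and the five data points are hypothetical, and the closed-form formulas are the standard ones for a single predictor.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = co-variation of x and y divided by the variation of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared differences between Y and Y-hat for a given line."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, made up for illustration only.
x = [10, 20, 30, 40, 50]
y = [800, 950, 1050, 1200, 1300]

intercept, slope = fit_line(x, y)
best = sum_squared_residuals(x, y, intercept, slope)
# Any other line, e.g. one with a slightly different slope, fits worse.
other = sum_squared_residuals(x, y, intercept, slope + 1.0)
print(intercept, slope, best < other)
```

The fitted line is the one with the smallest sum of squared residuals; changing the slope or the intercept can only increase that sum.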
Assumptions                     Test
Linearity                       Scatter Diagram
No Autocorrelation              Durbin-Watson
Normality of Errors             Jarque-Bera
No Heteroskedasticity           White's Test or Breusch-Pagan Test
No Perfect Multicollinearity    Variance Inflation Factor (VIF)
Note: Multicollinearity is applicable only in Multiple Regression.
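Two of the statistics named in the table can be computed directly from the regression residuals. The sketch below uses the textbook formulas for the Durbin-Watson and Jarque-Bera statistics; the function names and the example residuals are hypothetical, and it returns the test statistics only, not their p-values.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def jarque_bera(residuals):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with S = skewness and K = kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)


# Hypothetical residuals, made up for illustration only.
example_residuals = [0.5, -0.2, 0.1, -0.4, 0.3, -0.3]
print(durbin_watson(example_residuals), jarque_bera(example_residuals))
```

A Durbin-Watson value close to 2 is usually read as no first-order autocorrelation, and a small Jarque-Bera value is consistent with normally distributed errors.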
The Equation Object
• Single equation regression estimation in EViews is performed using the Equation Object.
There are a number of ways to create a simple OLS Equation Object:
3. The sample: specify your sample.
Specifying an Equation by List
• The easiest way to specify a linear equation is to provide a list of the variables that you wish to use in the equation.
• Suppose that you would like to know how well income explains financial wealth.
• To accomplish this, type the following in the Equation Estimation box:
1. The dependent variable (wealth).
2. “c” for the constant.
3. The independent variable (income).
Notice that the entries are separated by spaces.
Specifying Equation by List (cont’d)
To create an equation:
1. Select wealth and income by clicking on these series in the workfile (press CTRL to select multiple series). Notice that you need to select the dependent variable (wealth) first.
2. Right-click and select Open → as Equation.
Estimation Method
• After you specify the variable list, you need to select an estimation method.
• Click on the Method option, and you will see a drop-down menu listing the various estimation methods you can perform in EViews.
• Standard, single equation regression is performed using “Least Squares” (LS).
• In this tutorial we will use Least Squares and defer discussion of more advanced estimation techniques to subsequent tutorials.
Estimation Sample
• The third item you need to specify in the equation box is the Sample.
• You can specify the sample period in the sample space of the equation box.
For example, to estimate the following regression over the entire sample:
1. You need to include all observations (1 to 9275).
2. Click OK to estimate the regression. As seen in Equation Output, EViews has
included all observations when estimating this regression.
Estimation Sample (cont’d)
• What if you want to estimate the effect of income on wealth, but only for a
subset of individuals, e.g., married men?
To target a specific sample you need to:
1. Specify the sample: in this case, married men, so type: if married=1 and male=1.
2. Click OK to estimate the regression. As seen in Equation Output, EViews has included
only a subset of total observations (534 obs.)
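The effect of a sample condition can be mimicked outside EViews by filtering the observations before estimation. The sketch below is only an analogy in Python: the records, field values, and the `select_sample` helper are hypothetical and are not taken from the lecture's workfile.

```python
# Hypothetical observations, made up for illustration only.
records = [
    {"wealth": 12.0, "income": 30.0, "married": 1, "male": 1},
    {"wealth": 3.5, "income": 18.0, "married": 0, "male": 1},
    {"wealth": 40.0, "income": 55.0, "married": 1, "male": 0},
    {"wealth": 25.0, "income": 42.0, "married": 1, "male": 1},
    {"wealth": -2.0, "income": 15.0, "married": 0, "male": 0},
]


def select_sample(rows, condition=None):
    """Return all rows, or only those rows for which condition(row) is true."""
    if condition is None:
        return list(rows)
    return [row for row in rows if condition(row)]


# Entire sample: no condition.
full_sample = select_sample(records)

# Counterpart of typing "if married=1 and male=1" in the sample box.
married_men = select_sample(
    records, lambda row: row["married"] == 1 and row["male"] == 1
)

print(len(full_sample), len(married_men))
```

Only the rows that satisfy the condition would enter the regression, which is why the number of included observations drops when a condition is added.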
SIMPLE LINEAR REGRESSION
Dependent Variable: WEALTH
Method: Least Squares
Date: 04/05/25 Time: 10:44
Sample: 1 9275
Included observations: 9275

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -20.17948      1.176434      -17.15309      0.0000
INCOME      0.999911       0.025543      39.14569       0.0000

Start with the p-value of the F-statistic
• The p-value of the F-statistic should be < .05 in order to say that the whole model is significant.
• e.g., the F-statistic is 1532.385 with a p-value of < .001, thus the whole model is significant.
[Figure: annotated regression equation; labels: Independent Variable/Predictor, Error term.]
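In a regression with a single predictor, the F-statistic equals the square of the slope's t-statistic, so the two reported values can be cross-checked. The short sketch below does this with the numbers quoted in the output; the variable names are illustrative.

```python
# Values as reported in the EViews output for the WEALTH regression.
t_income = 39.14569
f_reported = 1532.385

# With one predictor, F = t^2 (up to rounding of the printed values).
f_from_t = t_income ** 2

print(round(f_from_t, 3), abs(f_from_t - f_reported) < 0.01)
```

This is why, in simple linear regression, the test of the whole model and the test of the slope lead to the same conclusion.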
SIMPLE LINEAR REGRESSION
Dependent Variable: WEALTH
Method: Least Squares
Date: 04/05/25 Time: 10:44
Sample: 1 9275
Included observations: 9275

Variable    Coefficient    Std. Error    t-Statistic    Prob.
C           -20.17948      1.176434      -17.15309      0.0000
INCOME      0.999911       0.025543      39.14569       0.0000

R-squared             0.141817     Mean dependent var       19.07168
Adjusted R-squared    0.141724     S.D. dependent var       63.96384
S.E. of regression    59.25813     Akaike info criterion    11.00190
Sum squared resid     32562380     Schwarz criterion        11.00344
Log likelihood        -51019.30    Hannan-Quinn criter.     11.00242
F-statistic           1532.385     Durbin-Watson stat       1.939237
Prob(F-statistic)     0.000000

Check the p-value (Prob.) of the t-statistic of the intercept (C) and interpret its coefficient
• The p-value of the t-statistic should be < .05 in order to say that the coefficient of the intercept is significant and included in the regression model.
• e.g., the t-statistic of the intercept (C) is -17.15309 with a p-value of < .001, thus the coefficient of the intercept (-20.17948) is significant.
• This means that when income is zero, the wealth of an individual is equal to -20.17948.
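Each t-statistic in the output is the coefficient divided by its standard error, and the fitted equation gives the predicted wealth for any income. The sketch below reproduces both from the reported numbers; the function names are illustrative, and the units of wealth and income are whatever the workfile uses.

```python
# Coefficients and standard errors as reported in the EViews output.
coefficients = {"C": -20.17948, "INCOME": 0.999911}
standard_errors = {"C": 1.176434, "INCOME": 0.025543}


def t_statistic(coefficient, standard_error):
    """t-statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / standard_error


def predict_wealth(income):
    """Predicted wealth (Y-hat) from the fitted line: C + INCOME * income."""
    return coefficients["C"] + coefficients["INCOME"] * income


t_intercept = t_statistic(coefficients["C"], standard_errors["C"])
t_income = t_statistic(coefficients["INCOME"], standard_errors["INCOME"])

print(round(t_intercept, 3), round(t_income, 3))
print(predict_wealth(0))  # equals the intercept when income is zero
```

Small differences from the printed t-statistics come from rounding in the displayed coefficients and standard errors.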
[Three regression outputs shown side by side]

Output 1 of 3
Variable               Coefficient    Std. Error    t-Statistic    Prob.
C                      3.523810       0.731100      4.819874       0.0000
AFFECTIVE_CYNICISM     -0.279221      0.141666      -1.970976      0.0503
BEHAVIORAL_CYNICISM    -0.326840      0.082393      -3.966851      0.0001
COGNITIVE_CYNICISM     0.270563       0.181384      1.491659       0.1375

Pseudo R-squared          0.219526    Mean dependent var    3.769928
Adjusted R-squared        0.206519    S.D. dependent var    0.764792
S.E. of regression        0.878479    Objective             38.17817
Quantile dependent var    3.166667    Restr. objective      48.91667
Sparsity                  1.950619    Quasi-LR statistic    58.72188
Prob(Quasi-LR stat)       0.000000

Output 2 of 3
Variable               Coefficient    Std. Error    t-Statistic    Prob.
C                      2.750000       0.655948      4.192404       0.0000
AFFECTIVE_CYNICISM     -0.166667      0.145062      -1.148931      0.2521
BEHAVIORAL_CYNICISM    1.82E-10       0.113380      1.60E-09       1.0000
COGNITIVE_CYNICISM     0.416667       0.149772      2.782013       0.0060

Pseudo R-squared          0.081092    Mean dependent var    3.769928
Adjusted R-squared        0.065777    S.D. dependent var    0.764792
S.E. of regression        0.677153    Objective             48.39583
Quantile dependent var    4.000000    Restr. objective      52.66667
Sparsity                  1.772560    Quasi-LR statistic    19.27532
Prob(Quasi-LR stat)       0.000240

Output 3 of 3
Variable               Coefficient    Std. Error    t-Statistic    Prob.
C                      3.624317       0.567325      6.388434       0.0000
AFFECTIVE_CYNICISM     -0.316940      0.175897      -1.801853      0.0732
BEHAVIORAL_CYNICISM    0.087432       0.120578      0.725106       0.4693
COGNITIVE_CYNICISM     0.321038       0.100721      3.187410       0.0017

Pseudo R-squared          0.111427    Mean dependent var    3.769928
Adjusted R-squared        0.096617    S.D. dependent var    0.764792
S.E. of regression        0.854605    Objective             37.39413
Quantile dependent var    4.000000    Restr. objective      42.08333
Sparsity                  2.119847    Quasi-LR statistic    23.59519
Prob(Quasi-LR stat)       0.000030
PERFORMING THE REGRESSION ANALYSIS IN SPSS
1. Go to Analyze -> Regression -> Linear.
2. Click “Statistics”, check “Model Fit”, “R Square Change”, “Collinearity Diagnostics”, and “Durbin Watson”, then click “Continue”.
3. Click “Plots”, transfer “*ZRESID” to the Y box and “*ZPRED” to the X box, check “Histogram” and “Normal Probability plot”, then click “Continue”.
4. Click “OK”.
CHECKING THE ASSUMPTIONS
Adjusted R square indicates the percentage of the variation in the dependent variable (DV) that is explained by the independent variable (IDV). In this case, 46% of the variation/changes in Sales can be explained by Radio Advertisement.
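Adjusted R square is R square corrected for the number of predictors. As a rough check of the standard formula, the sketch below applies it to the WEALTH regression reported earlier in this lecture (R-squared 0.141817, 9275 observations, one predictor); the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Values from the WEALTH on INCOME output shown earlier in the lecture.
adj = adjusted_r_squared(r_squared=0.141817, n_obs=9275, n_predictors=1)
print(round(adj, 6))
```

With a large sample and a single predictor the adjustment is tiny; it matters more when the sample is small or many predictors are added.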
If the p-value is < .05, the null hypothesis that there is no significant relationship/association between the dependent variable and the independent variable/s is rejected; otherwise, we fail to reject it and conclude that there is no significant association.
Note: If it is > .05, there is no need to interpret the other parts.
If the p-value is < .05, the null hypothesis stating that “the derived coefficient in the model is not significant” is rejected; otherwise, we fail to reject it and treat the coefficient as not significant.
Note: If it is > .05, there is no need to interpret the derived coefficient (B).
Intercept = 699.96 means that when the radio ad is equal to zero (0), sales are equal to $699.96.
Slope/Coefficient of Independent Variable = 12.162 means that for every one (1) unit increase in radio ad, there is a corresponding increase of $12.16 in sales.
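The interpretation of the intercept and slope follows directly from the fitted equation, Sales-hat = 699.96 + 12.162 × Radio Ad. A minimal sketch, with an illustrative function name and the coefficients quoted above:

```python
INTERCEPT = 699.96  # predicted sales when radio ad is zero
SLOPE = 12.162      # change in predicted sales per one-unit increase in radio ad


def predict_sales(radio_ad):
    """Predicted sales (Y-hat) from the fitted simple regression line."""
    return INTERCEPT + SLOPE * radio_ad


print(predict_sales(0))
# One extra unit of radio ad changes the prediction by the slope.
print(round(predict_sales(11) - predict_sales(10), 3))
```

Predictions are most trustworthy for radio ad values inside the range observed in the data.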
MULTIPLE LINEAR REGRESSION
In multiple linear regression analysis, the same concepts apply; however, there is one additional assumption that needs to be satisfied: no multicollinearity exists in the model.
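Multicollinearity is commonly checked with the Variance Inflation Factor, VIF = 1 / (1 - R²), where R² comes from regressing one predictor on the other predictors. With exactly two predictors, that R² is the squared correlation between them, which the sketch below uses. The data and function names are hypothetical; SPSS and EViews report VIF directly.

```python
def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model."""
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:
        return float("inf")  # perfect multicollinearity
    return 1 / (1 - r_squared)


# Hypothetical predictor values, made up for illustration only.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
print(round(vif_two_predictors(x1, x2), 4))
```

A VIF of 1 means the predictors are uncorrelated; larger values indicate stronger multicollinearity, and the cutoff for concern is a rule of thumb rather than a fixed standard.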