Regression Tutorial

(1) The document describes the process of importing an Excel file into SPSS and setting up a regression model to analyze the relationship between GDP and several other variables. (2) It specifies importing the Excel file, setting import options, and formatting a variable for the regression. (3) The regression model is defined with GDP as the dependent variable and tax rate, life expectancy, years of schooling, unemployment, and net savings as independent variables. Output from the regression is examined to evaluate the model fit and significance of predictors.

(1) Since the original document is an Excel file, we first have to import it into SPSS. Open SPSS and
navigate to this menu:

(2) Navigate to wherever your file is located, select it, and click "Open".

(3) Now we need to specify a few things so that SPSS imports the data correctly. Set up the
dialog that pops up like this. Also change the "percentage of values that determine data type" from
95 to 70. This is needed because one variable is problematic: in the Net savings variable, many values
are replaced by multiple periods (…), and if you leave the setting at 95, SPSS will code the variable
as a string that cannot be used in regression. Click OK. Then navigate to "Variable View" at the
bottom left of the page, find the Net savings variable at the bottom of the list, and set "Decimals"
to 3 instead of 15. We are good to go.
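
The effect of the threshold can be illustrated with a short Python sketch. This is not SPSS's actual import routine: the function name infer_type and the sample column are made up for illustration, and the rule is only assumed to be "treat a column as numeric if at least the given percentage of its values parse as numbers".

```python
def is_number(value):
    """Return True if the text can be parsed as a number."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def infer_type(values, threshold_percent):
    """Hypothetical rule: numeric if enough values parse as numbers."""
    numeric_share = 100 * sum(is_number(v) for v in values) / len(values)
    return "numeric" if numeric_share >= threshold_percent else "string"


# Made-up column: 8 numeric values and 2 cells holding "..." (80% numeric).
net_savings = ["12.5", "8.1", "...", "3.3", "15.0", "...", "7.7", "9.2", "4.4", "6.0"]

print(infer_type(net_savings, 95))  # string
print(infer_type(net_savings, 70))  # numeric
```

With 80% of the cells numeric, the column fails a 95% threshold but passes a 70% one, which is the situation described for the Net savings variable.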

(4) Now it's time to run the regression models. The first model the assignment asks for is:

GDP = [Tax rate, Life Expectancy, Years of Schooling, Unemployment, Net Saving]

This is a very rudimentary definition of a regression model. If we want to be precise, we can
reformulate it with a bit more math as follows:

GDP = β0 + β1(TaxRate) + β2(LifeExp) + β3(Years) + β4(Unemployment) + β5(NetSaving) + e

This equation just means that the predicted GDP for a given country is the intercept (β0) plus the
sum of each variable in the model (the parts in brackets) multiplied by its so-called "regression
coefficient", also called Bs or betas (β). The term e is the error: the difference between a
country's actual GDP and the predicted value. The regression coefficients determine the direction
and the strength of the relationship between the dependent variable (GDP) and each independent variable.
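
As a sketch of how the equation produces a prediction, the Python snippet below adds the intercept to the sum of coefficient-times-value products. All coefficients and country values here are invented placeholders, not results from the assignment's data.

```python
# Hypothetical coefficients (NOT taken from the SPSS output).
intercept = 100.0  # β0
coefficients = {
    "TaxRate": -2.0,
    "LifeExp": 3.0,
    "Years": 5.0,
    "Unemployment": -1.5,
    "NetSaving": 0.5,
}

# Hypothetical predictor values for one country.
country = {
    "TaxRate": 30.0,
    "LifeExp": 75.0,
    "Years": 12.0,
    "Unemployment": 6.0,
    "NetSaving": 10.0,
}


def predict(intercept, coefficients, row):
    """Predicted GDP = β0 + sum of β_i * x_i (the error term e is not predicted)."""
    return intercept + sum(b * row[name] for name, b in coefficients.items())


predicted_gdp = predict(intercept, coefficients, country)
print(predicted_gdp)  # 321.0
```

A negative coefficient (here the made-up ones for TaxRate and Unemployment) pulls the prediction down as the variable increases, while a positive one pushes it up.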

To run the model, go here:


(5) Next, specify the model by placing the variables into the boxes as follows:

(6) We'll also select a few more things. Click "Statistics" first, and set it up like this:
(7) Click OK to run the model:

Of all the output, this is the part you will need to create the table in the assignment and to report the
regressions if needed. The values to look at are:

R: The multiple correlation coefficient. It tells you how strongly the actual, observed values in the
dataset correlate with the values predicted by the regression. The closer this is to 1, the better the
model fits. Here, 0.231 indicates a pretty bad fit.
R-squared: Also called the coefficient of determination, it tells you the proportion of variance in the
dependent variable accounted for by the model as a whole. Here, 0.053 means that the model as a
whole is barely able to account for 5.3% of the variance in GDP, which is very bad.

Adjusted R-squared: The same as R-squared, but it corrects for the fact that adding variables to a
model, even meaningless ones, never decreases R-squared, and adjusts the value downwards
accordingly. It is always smaller than R-squared.
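
The relationship between these three numbers can be checked with a few lines of Python. The sketch assumes the standard adjustment formula, 1 - (1 - R²)(n - 1)/(n - k - 1), with k = 5 predictors and n = 146 cases. The sample size is not stated directly in the tutorial; it is inferred here from the degrees of freedom reported for the F-test, F(5, 140). Because the inputs are rounded, the results may differ slightly from the SPSS table.

```python
r = 0.231   # multiple correlation coefficient from the output
k = 5       # number of predictors in the model
n = 146     # assumed sample size: residual df (140) + k + 1

r_squared = r ** 2
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(round(r_squared, 3))           # 0.053
print(round(adjusted_r_squared, 3))
```

The adjusted value comes out well below the already small R-squared, because five predictors are being charged against a model that explains very little.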

F, df, and p: The F-test tells you whether the model is significant overall. It tests the
null hypothesis that the multiple correlation coefficient, R, is equal to zero in the population. You
report it as F(df1, df2) = XXXXXX, p = YYYY. Here, you would say: "The overall model was not
statistically significant (F(5, 140) = 1.572, p = 0.172)." Remember, the model is significant if p is
lower than 0.05.
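
The F statistic can be reproduced approximately from R-squared and the degrees of freedom. The sketch below assumes the usual formula F = (R²/df1) / ((1 - R²)/df2) and uses the rounded R-squared of 0.053, so it lands close to, but not exactly on, the reported 1.572. The p-value is copied from the output quoted above rather than computed.

```python
r_squared = 0.053      # rounded value from the model summary
df1, df2 = 5, 140      # regression and residual degrees of freedom
p_value = 0.172        # copied from the reported output, not computed here
alpha = 0.05           # conventional significance level

f_stat = (r_squared / df1) / ((1 - r_squared) / df2)
significant = p_value < alpha

verdict = "significant" if significant else "not significant"
print(f"F({df1}, {df2}) = {f_stat:.3f}, p = {p_value}: model is {verdict}")
```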

Unstandardized coefficients: Also called Bs or betas. They indicate how much the dependent
variable is predicted to change following a single unit of change in the independent variable, with the
other independent variables held constant. Here, the numbers look "weird" because the dependent
variable (GDP) is an extremely large number, so SPSS uses scientific notation. For life expectancy,
the coefficient is reported as 4.606E+10, which tells you to move the decimal point 10 places to the
right, so 4.606E+10 is the same as 46060000000. So every additional year of life expectancy
increases predicted GDP by 46060000000 units.
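
If you want to double-check a value in scientific notation, Python can expand it for you. This is just a convenience sketch and not part of the SPSS workflow.

```python
coefficient = float("4.606E+10")  # the value as printed in the SPSS output

print(int(coefficient))       # 46060000000
print(f"{coefficient:,.0f}")  # 46,060,000,000
```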

Sig. column: These are the p-values testing the null hypothesis that each coefficient is zero in the
population. In this model, all of them are larger than 0.05, so none of the independent variables is a
significant predictor. Taken together with the results above, this means that the model is a pretty bad fit.

Now we do the same for the remaining models, using the same steps, changing the variables
according to the assignment.
