Lecture 4 Regression Analysis

This document discusses various statistical methods for testing relationships and associations, including Pearson R, Spearman Rho, Chi-square, and regression analysis. It outlines the assumptions and applications of these methods, as well as the use of software tools like Jamovi and EViews for parameter estimation. The document also provides an overview of regression analysis techniques and the steps involved in modeling and estimating relationships between variables.


LECTURE 4

TEST OF RELATIONSHIP AND ASSOCIATION


BA 502: Statistics with Computer Application
Learning Objectives
At the end of the discussion, learners will be able to:
1. Explain the test of relationship and association
2. Analyze cases where Pearson R, Spearman Rho, Chi-square, and regression analysis apply
3. Utilize both Jamovi and EViews in estimating the parameters
INFERENTIAL STATISTICS
Inferential statistics is a statistical method that deduces the characteristics of a bigger population from a small but representative sample. In other words, it allows the researcher to draw conclusions about a wider group, using a smaller portion of that group as a guide.
SOME INFERENTIAL STATISTICS TOOLS (UNIVARIATE)

Relationship/association tests covered in this lecture:
• Pearson R Correlation
• Spearman Rho Correlation
• Chi-Square
• Simple Regression Analysis


PEARSON PRODUCT MOMENT CORRELATION

The Pearson product-moment correlation coefficient (or Pearson correlation coefficient, for short) is a measure of the strength of a linear association between two variables and is denoted by r. Basically, a Pearson product-moment correlation attempts to draw a line of best fit through the data of two variables, and the Pearson correlation coefficient, r, indicates how far away all these data points are from this line of best fit (i.e., how well the data points fit this new model/line of best fit).

The Pearson correlation coefficient, r, can take a range of values from +1 to -1. A value of 0 indicates that there is no association between the two variables. A value greater than 0 indicates a positive association; that is, as the value of one variable increases, so does the value of the other variable. A value less than 0 indicates a negative association; that is, as the value of one variable increases, the value of the other variable decreases.



PEARSON PRODUCT MOMENT CORRELATION
The stronger the association of the two variables, the closer the Pearson correlation coefficient, r, will be to either +1 or -1, depending on whether the relationship is positive or negative, respectively. A value of exactly +1 or -1 means that all of your data points lie on the line of best fit, with no data points showing any variation away from this line. Values of r between +1 and -1 (for example, r = 0.8 or -0.4) indicate that there is variation around the line of best fit; the closer the value of r is to 0, the greater the variation around the line of best fit.
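To make this concrete, here is a minimal sketch computing r in Python with scipy (the data values below are hypothetical, not from the lecture):

```python
# A minimal sketch of computing Pearson's r (illustrative data, not the lecture's).
import numpy as np
from scipy import stats

income = np.array([20, 35, 28, 50, 42, 61, 30, 55])   # hypothetical values
wealth = np.array([15, 40, 22, 70, 48, 90, 25, 66])   # hypothetical values

r, p_value = stats.pearsonr(income, wealth)
print(f"Pearson r = {r:.3f}, p-value = {p_value:.4f}")
```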


PEARSON PRODUCT MOMENT CORRELATION
Assumptions

1. Your two variables should be measured on a continuous scale (i.e., they are measured at the interval or ratio level).
2. Your two continuous variables should be paired, which means that each case (e.g., each participant) has two values: one for each variable. These "values" are also referred to as "data points".
3. There should be independence of cases, which means that the two observations for one case are unrelated to the observations for any other case.
4. There should be a linear relationship between your two continuous variables.
5. There should be homoscedasticity, which means that the variances along the line of best fit remain similar as you move along the line.
6. There should be no univariate or multivariate outliers.



SPEARMAN'S RANK-ORDER CORRELATION
The Spearman's rank-order correlation is the nonparametric version of the Pearson product-moment correlation. Spearman's correlation coefficient (ρ, also signified by rs) measures the strength and direction of association between two ranked variables. You need two variables that are either ordinal, interval, or ratio. Although you would normally hope to use a Pearson product-moment correlation on interval or ratio data, the Spearman correlation can be used when the assumptions of the Pearson correlation are markedly violated. However, Spearman's correlation determines the strength and direction of the monotonic relationship between your two variables, rather than the strength and direction of the linear relationship, which is what Pearson's correlation determines.
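A minimal sketch of the computation in Python with scipy (the ordinal ratings below are hypothetical):

```python
# A minimal sketch of Spearman's rho on ranked/ordinal-style data (illustrative values).
import numpy as np
from scipy import stats

satisfaction = np.array([1, 2, 2, 3, 4, 4, 5, 5])  # hypothetical ordinal ratings
loyalty      = np.array([2, 1, 3, 3, 4, 5, 4, 5])  # hypothetical ordinal ratings

rho, p_value = stats.spearmanr(satisfaction, loyalty)
print(f"Spearman rho = {rho:.3f}, p-value = {p_value:.4f}")
```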




CHI-SQUARE TEST FOR ASSOCIATION

The Chi-Square Test for Association is used to determine if there is any association between two variables. It is really a hypothesis test of independence. The null hypothesis is that the two variables are not associated, i.e., independent. The alternative hypothesis is that the two variables are associated.

Assumptions
1. Your two variables should be measured at an ordinal or nominal level (i.e., categorical data).
2. Each of your two variables should consist of two or more categorical, independent groups.
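As a quick illustration, a minimal sketch of the test in Python with scipy (the contingency counts are hypothetical):

```python
# A minimal sketch of a chi-square test for association on a 2x2 table
# (hypothetical counts: gender vs. product preference).
import numpy as np
from scipy import stats

observed = np.array([[30, 10],   # male:   prefers A, prefers B
                     [20, 25]])  # female: prefers A, prefers B

chi2, p_value, dof, expected = stats.chi2_contingency(observed)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.4f}, df = {dof}")
```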


Regression Analysis
Regression analysis is a statistical method used to
examine the relationship between one dependent variable
and one or more independent variables. The goal of
regression analysis is to understand the nature of the
relationship between the variables, make predictions, and
assess the strength and significance of the relationship. It
is widely used in various fields, including economics, finance,
biology, psychology, and many others.
Methods of Regression Analysis
1. Ordinary Least Squares (OLS). OLS is the most widely used method for
linear regression. It minimizes the sum of squared differences between the
observed and predicted values.
2. Weighted Least Squares (WLS). WLS is an extension of OLS that assigns
different weights to different observations. This is useful when there is
heteroscedasticity, meaning that the variance of the errors is not constant
across all levels of the independent variable.
3. Ridge Regression. Ridge regression introduces a regularization term to the
OLS objective function. It is used to prevent overfitting by penalizing large
coefficients. This is especially useful when dealing with multicollinearity.
Methods of Regression Analysis
4. Lasso Regression. Lasso regression, like ridge regression, adds a regularization term, but in this case, it includes an absolute value penalty. Lasso can lead to some coefficients being exactly zero, effectively performing variable selection.
5. Stepwise Regression. Stepwise regression is an automated variable
selection method that systematically adds or removes variables from the
model based on statistical criteria. It can be forward, backward, or a
combination of both.
6. Quantile Regression. Quantile regression estimates the relationship
between variables at different quantiles of the conditional distribution of the
dependent variable. It is useful when the relationship varies across different
parts of the distribution.
Methods of Regression Analysis
7. Bayesian Regression. Bayesian regression incorporates Bayesian principles into the modeling process. It provides a posterior distribution for the parameters, allowing for the incorporation of prior knowledge.
8. Nonlinear Regression. Nonlinear regression is used when the
relationship between the dependent and independent variables is not
linear. The model is specified as a nonlinear function of the parameters.
9. Logistic Regression. Logistic regression is used for binary or multinomial
classification problems. It models the probability of an event occurring
based on one or more predictor variables.
10. Poisson Regression. Poisson regression is used for count data,
assuming that the response variable follows a Poisson distribution.
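To make the contrast among the first few methods concrete, here is a minimal sketch fitting OLS, ridge, and lasso to the same data (scikit-learn; the data and alpha values are illustrative, not from the lecture):

```python
# A minimal sketch contrasting OLS, ridge, and lasso on the same hypothetical data.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                              # three predictors
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=100)   # X[:, 2] is irrelevant

for name, model in [("OLS", LinearRegression()),
                    ("Ridge", Ridge(alpha=1.0)),
                    ("Lasso", Lasso(alpha=0.1))]:
    model.fit(X, y)
    print(name, np.round(model.coef_, 3))   # lasso tends to zero out X[:, 2]
```

Note how lasso can drive the coefficient of the irrelevant predictor to exactly zero, which is the variable-selection behavior described in item 4.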
Regression Modeling Steps
• Define problem or question
• Specify model
• Collect data
• Do descriptive data analysis
• Estimate unknown parameters
• Evaluate model
• Use model for prediction
METHOD OF LEAST SQUARES
• The straight line that best fits the data.
• Determine the straight line for which the differences between the actual values (Y) and the values that would be predicted from the fitted line of regression (Y-hat) are as small as possible.

(Figure: scatter plot of Sales against Radio and TV Ads with fitted line f(x) = 12.1620x + 699.9572.)
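As a small illustration of the idea, the sketch below fits a least-squares line with numpy (the ad/sales numbers are made up, not the slide's data):

```python
# A minimal sketch of fitting a least-squares line like the one in the figure
# (np.polyfit minimizes the sum of squared residuals; the data is hypothetical).
import numpy as np

ads   = np.array([10, 20, 30, 40, 50, 60, 70])               # radio and TV ads
sales = np.array([820, 950, 1080, 1150, 1320, 1410, 1550])   # sales

slope, intercept = np.polyfit(ads, sales, deg=1)
print(f"fitted line: sales = {slope:.3f} * ads + {intercept:.3f}")
```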


GOAL
Develop a statistical model that can predict the values of a dependent (response) variable based upon the values of the independent (explanatory) variables.

SIMPLE REGRESSION
• β₁ represents the unit change in Y per unit change in X.
• Does not take into account any other variable besides the single independent variable.
• A statistical model that utilizes one quantitative independent variable "X" to predict the quantitative dependent variable "Y."

MULTIPLE REGRESSION
• βi represents the unit change in Y per unit change in Xi.
• Takes into account the effect of the other βi's ("net regression coefficient").
• A statistical model that utilizes two or more quantitative and qualitative explanatory variables (x1, ..., xp) to predict a quantitative dependent variable Y.
Caution: have at least two or more quantitative explanatory variables (rule of thumb).
SIMPLE LINEAR REGRESSION

Simple linear regression is a statistical method that allows us to summarize and study relationships between two continuous (quantitative) variables:
• One variable, denoted x, is regarded as the predictor, explanatory, or independent variable.
• The other variable, denoted y, is regarded as the response, outcome, or dependent variable.
Simple linear regression gets its adjective "simple" because it concerns the study of how only one independent variable relates to one dependent variable.
Assumptions
• Linearity. The relationship between the dependent variable and the
independent variables is assumed to be linear. The model assumes
that changes in the independent variables lead to proportional
changes in the dependent variable.
• No autocorrelation. The errors are assumed to be independent of
each other, meaning that there should be no systematic pattern in the
residuals over time or across observations.
• Homoscedasticity. The variance of the errors should be constant
across all levels of the independent variables. In other words, the
spread of the residuals should be roughly constant as you move
along the range of predicted values.



Assumptions

• Normality of Errors. The errors are assumed to be normally distributed. While this assumption is not crucial for large sample sizes due to the Central Limit Theorem, it is important for small sample sizes to ensure the validity of statistical inference, such as hypothesis testing and confidence intervals.
• No Perfect Multicollinearity. The independent variables should not be perfectly correlated. Perfect multicollinearity occurs when one independent variable can be exactly predicted from another, leading to difficulties in estimating the individual coefficients.


Test of Assumptions

Assumption                      Test
Linearity                       Scatter Diagram
No Autocorrelation              Durbin-Watson
Normality of Errors             Jarque-Bera
No Heteroskedasticity           White's Test or Breusch-Pagan Test
No Perfect Multicollinearity    Variance Inflation Factor (VIF)

Note: Multicollinearity is applicable only in multiple regression.
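These checks can also be run in code; below is a minimal sketch using statsmodels' diagnostic functions on hypothetical data:

```python
# A minimal sketch of the assumption tests above using statsmodels
# (hypothetical data; Durbin-Watson, Jarque-Bera, Breusch-Pagan, and VIF).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson, jarque_bera
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(100, 2)))     # constant + two predictors
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=100)

res = sm.OLS(y, X).fit()

print("Durbin-Watson:", durbin_watson(res.resid))         # near 2: no autocorrelation
jb_stat, jb_p, skew, kurtosis = jarque_bera(res.resid)
print("Jarque-Bera p-value:", jb_p)                       # normality of errors
lm_stat, lm_p, f_stat, f_p = het_breuschpagan(res.resid, X)
print("Breusch-Pagan p-value:", lm_p)                     # homoskedasticity
for i in range(1, X.shape[1]):                            # skip the constant
    print(f"VIF(x{i}):", variance_inflation_factor(X, i))
```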


Basic Regression Analysis
• EViews has a very powerful and easy-to-use estimation toolkit
that allows you to estimate from the simplest to the most complex
regression analysis.
• This tutorial explains basic regression techniques in EViews for
single equation regressions using cross-section data.
• The main topics include:
 Specifying and estimating an equation
 Equation Objects (saving, labeling, freezing, printing)
 Equation Output: Analyzing and Interpreting results
 Multiple Regression Analysis
 Estimation with Data Expressions and Functions
 Post Estimation: Working with Equations
 Hypothesis testing
 Estimation Options (robust standard errors, weighted least squares)
Description of Data File
• Data_wf1 has the following data on 9,275 individuals*:

Wealth – net total financial wealth (in thousands of dollars)
Income – annual income (in thousands of dollars)
Male – dummy variable, equal to 1 if male, 0 otherwise
Married – dummy variable, equal to 1 if married, 0 otherwise
Age – age in years (minimum age in the dataset is 25 years)
Fsize – family size; the number of individuals living in the family

* This data is from Wooldridge, Introductory Econometrics (4th Edition).
The Equation Object
• Single equation regression estimation in EViews is performed using the Equation Object.
There are a number of ways to create a simple OLS Equation Object:
1. From the Main menu, select Object → New Object → Equation.
2. From the Main menu, select Quick → Estimate Equation.
The Equation Box

• In all cases, the Equation Estimation box appears. You need to specify three things in this dialogue box:
1. The equation specification, either by (a) list or (b) formula (explained in future tutorials).
2. The estimation method.
3. The sample.
Specifying an Equation by List
• The easiest way to specify a linear equation is to provide a list of variables that you wish to use in the equation.
• Suppose that you would like to know how well income explains financial wealth. To accomplish this, type in the Equation Estimation box:
1. The dependent variable (wealth).
2. "c" for the constant.
3. The independent variable (income).
Notice that the entries are all separated by spaces.
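For comparison, the same "wealth c income" specification can be estimated outside EViews; here is a minimal sketch with statsmodels, using simulated data standing in for Data_wf1:

```python
# A minimal sketch of the EViews list specification "wealth c income"
# using statsmodels (simulated data standing in for Data_wf1).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
income = rng.gamma(shape=2.0, scale=20.0, size=9275)         # in thousands of dollars
wealth = -20.0 + 1.0 * income + rng.normal(0, 59.0, 9275)    # in thousands of dollars

X = sm.add_constant(income)    # the constant plays the role of "c"
res = sm.OLS(wealth, X).fit()
print(res.summary())           # coefficients, t-statistics, F-statistic, R-squared
```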
Specifying Equation by List (cont’d)

• Alternatively, you can also create an Equation simply by selecting the series and opening them as an Equation. To create an equation:
1. Select wealth and income by clicking on these series in the workfile (press CTRL to select multiple series). Notice that you need to select the dependent variable (wealth) first.
2. Right click and select Open → as Equation.
Specifying Equation by List (cont’d)

3. The Equation Estimation dialog box opens, listing your independent variable, dependent variable, and the constant.
4. Click OK to estimate the regression.
Estimation Method
• After you specify the variable list, you need to select an estimation method.
• Click on the Method option, and you see a drop-down menu listing the various estimation methods you can perform in EViews.
• Standard, single-equation regression is performed using "Least Squares" (LS).
• In this tutorial we will use Least Squares and defer discussion of more advanced estimation techniques to subsequent tutorials.
Estimation Sample
• The third item you need to specify in the equation box is the Sample. You can specify the sample period in the sample space of the equation box.
For example, to estimate the regression over the entire sample:
1. You need to include all observations (1 to 9275).
2. Click OK to estimate the regression. As seen in the Equation Output, EViews has included all observations when estimating this regression.
Estimation Sample (cont’d)

• What if you want to estimate the effect of income on wealth, but only for a subset of individuals, e.g., married men? To target a specific sample you need to:
1. Specify the sample: in this case married men, so type: if married=1 and male=1.
2. Click OK to estimate the regression. As seen in the Equation Output, EViews has included only a subset of the total observations (534 obs.).
SIMPLE LINEAR REGRESSION

Dependent Variable: WEALTH
Method: Least Squares
Date: 04/05/25  Time: 10:44
Sample: 1 9275
Included observations: 9275

Variable              Coefficient   Std. Error   t-Statistic   Prob.
C                     -20.17948     1.176434     -17.15309     0.0000
INCOME                0.999911      0.025543     39.14569      0.0000

R-squared             0.141817      Mean dependent var      19.07168
Adjusted R-squared    0.141724      S.D. dependent var      63.96384
S.E. of regression    59.25813      Akaike info criterion   11.00190
Sum squared resid     32562380      Schwarz criterion       11.00344
Log likelihood        -51019.30     Hannan-Quinn criter.    11.00242
F-statistic           1532.385      Durbin-Watson stat      1.939237
Prob(F-statistic)     0.000000

 Start with the p-value of the F-statistic. The p-value of the F-statistic should be <.05 in order to say that the whole model is significant.
• E.g., the F-statistic is 1532.385 with a p-value of <.001; thus the whole model is significant.

 Followed by the Adjusted R-squared. The adjusted R² describes the degree of association between the dependent and independent variables.
• E.g., the adjusted R² is 0.141724, which means that 14.17% of the changes in wealth are explained by changes in income.


LINEAR MODEL
The relationship between one dependent variable and one or more independent variables is a linear function:

Y = β₀ + β₁X + ε

where Y is the dependent variable, β₀ is the population Y-intercept, β₁ is the slope (coefficient of the independent variable), X is the independent variable (predictor), and ε is the error term.
SIMPLE LINEAR REGRESSION (same output as above)

 Check the p-value (Prob.) of the t-statistic of the intercept (C) and interpret its coefficient. The p-value of the t-statistic should be <.05 in order to say that the coefficient of the intercept is significant and included in the regression model.
• E.g., the t-statistic of the intercept (C) is -17.15309 with a p-value of <.001; thus the coefficient of the intercept (-20.17948) is significant.
• This means that when income is zero, an individual's wealth is equal to -20.17948 (thousand dollars).

SIMPLE LINEAR REGRESSION (same output as above)

 Check the p-value (Prob.) of the t-statistic of the independent variable and interpret its coefficient. The p-value of the t-statistic should be <.05 in order to say that the coefficient of the independent variable is significant and included in the regression model.
• E.g., the t-statistic of income is 39.14569 with a p-value of <.001; thus the coefficient of income (0.999911) is significant.
• Since both income and wealth are measured in thousands of dollars, this means that for every $1 increase in income, there is a corresponding increase of about $1.00 (0.999911) in wealth.

LINEAR MODEL
Dependent Variable: AFFECTIVE_COM
Method: Least Squares
Date: 12/10/23  Time: 08:47
Sample: 1 184
Included observations: 184

Variable               Coefficient   Std. Error   t-Statistic   Prob.
C                      3.095244      0.316778     9.771016      0.0000
AFFECTIVE_CYNICISM     -0.220940     0.087684     -2.519742     0.0126
BEHAVIORAL_CYNICISM    -0.082642     0.066187     -1.248619     0.2134
COGNITIVE_CYNICISM     0.377496      0.077939     4.843456      0.0000

R-squared             0.251872      Mean dependent var      3.769928
Adjusted R-squared    0.239403      S.D. dependent var      0.764792
S.E. of regression    0.666992      Akaike info criterion   2.049422
Sum squared resid     80.07813      Schwarz criterion       2.119312
Log likelihood        -184.5469     Hannan-Quinn criter.    2.077750
F-statistic           20.20020      Durbin-Watson stat      2.141296
Prob(F-statistic)     0.000000

Writing the fitted model (BEHAVIORAL_CYNICISM is omitted because its p-value, 0.2134, exceeds .05):

Affective_com = 3.095 − 0.220 Affective_cy + 0.377 Cognitive_cy + 0.667
(equivalently: Acom = 3.095 − 0.220 Acy + 0.377 Ccy + 0.667, where 0.667 is the model's standard error of regression)
Dependent Variable: OVERALL_A
Method: Least Squares
Date: 12/09/23  Time: 10:02
Sample: 1 184
Included observations: 184
HAC standard errors & covariance (Bartlett kernel, Newey-West fixed bandwidth = 5.0000)

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C             3.095244      0.451116     6.861299      0.0000
OVERALL_AD    -0.220940     0.123543     -1.788363     0.0754
OVERALL_BD    -0.082642     0.093369     -0.885120     0.3773
OVERALL_CD    0.377496      0.096549     3.909874      0.0001

R-squared             0.251872      Mean dependent var        3.769928
Adjusted R-squared    0.239403      S.D. dependent var        0.764792
S.E. of regression    0.666992      Akaike info criterion     2.049422
Sum squared resid     80.07813      Schwarz criterion         2.119312
Log likelihood        -184.5469     Hannan-Quinn criter.      2.077750
F-statistic           20.20020      Durbin-Watson stat        2.141296
Prob(F-statistic)     0.000000      Wald F-statistic          23.60263
                                    Prob(Wald F-statistic)    0.000000

Durbin-Watson table bounds: LL = 1.7236, UL = 1.791.

Correlation matrix:
              OVERALL_AD      OVERALL_BD      OVERALL_CD
OVERALL_AD    1               0.65468920...   -0.2454725...
OVERALL_BD    0.65468920...   1               -0.2962268...
OVERALL_CD    -0.2454725...   -0.2962268...   1
QUANTILE REGRESSION
Dependent Variable: AFFECTIVE_COM; Method: Quantile Regression; Sample: 1 184; Included observations: 184; Huber Sandwich Standard Errors & Covariance; Sparsity method: Kernel (Epanechnikov) using residuals; Bandwidth method: Hall-Sheather; estimation successfully identifies a unique optimal solution. Mean dependent var 3.769928; S.D. dependent var 0.764792.

tau = 0.25 (bw = 0.1183):
Variable               Coefficient   Std. Error   t-Statistic   Prob.
C                      3.523810      0.731100     4.819874      0.0000
AFFECTIVE_CYNICISM     -0.279221     0.141666     -1.970976     0.0503
BEHAVIORAL_CYNICISM    -0.326840     0.082393     -3.966851     0.0001
COGNITIVE_CYNICISM     0.270563      0.181384     1.491659      0.1375
Pseudo R-squared 0.219526; Adjusted R-squared 0.206519; S.E. of regression 0.878479; Objective 38.17817; Quantile dependent var 3.166667; Restr. objective 48.91667; Sparsity 1.950619; Quasi-LR statistic 58.72188; Prob(Quasi-LR stat) 0.000000

Median (tau = 0.50, bw = 0.17082):
Variable               Coefficient   Std. Error   t-Statistic   Prob.
C                      2.750000      0.655948     4.192404      0.0000
AFFECTIVE_CYNICISM     -0.166667     0.145062     -1.148931     0.2521
BEHAVIORAL_CYNICISM    1.82E-10      0.113380     1.60E-09      1.0000
COGNITIVE_CYNICISM     0.416667      0.149772     2.782013      0.0060
Pseudo R-squared 0.081092; Adjusted R-squared 0.065777; S.E. of regression 0.677153; Objective 48.39583; Quantile dependent var 4.000000; Restr. objective 52.66667; Sparsity 1.772560; Quasi-LR statistic 19.27532; Prob(Quasi-LR stat) 0.000240

tau = 0.75 (bw = 0.1183):
Variable               Coefficient   Std. Error   t-Statistic   Prob.
C                      3.624317      0.567325     6.388434      0.0000
AFFECTIVE_CYNICISM     -0.316940     0.175897     -1.801853     0.0732
BEHAVIORAL_CYNICISM    0.087432      0.120578     0.725106      0.4693
COGNITIVE_CYNICISM     0.321038      0.100721     3.187410      0.0017
Pseudo R-squared 0.111427; Adjusted R-squared 0.096617; S.E. of regression 0.854605; Objective 37.39413; Quantile dependent var 4.000000; Restr. objective 42.08333; Sparsity 2.119847; Quasi-LR statistic 23.59519; Prob(Quasi-LR stat) 0.000030
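Quantile regressions like these can also be estimated with statsmodels; below is a minimal sketch at the same three quantiles (the data frame is simulated, not the lecture's dataset):

```python
# A minimal sketch of quantile regression at the three quantiles shown above
# (statsmodels; the DataFrame and its values are simulated, not the lecture's data).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "affective_cynicism": rng.normal(3, 1, 184),
    "cognitive_cynicism": rng.normal(3, 1, 184),
})
df["affective_com"] = (3.1 - 0.2 * df["affective_cynicism"]
                       + 0.4 * df["cognitive_cynicism"]
                       + rng.normal(0, 0.7, 184))

model = smf.quantreg("affective_com ~ affective_cynicism + cognitive_cynicism", df)
for tau in (0.25, 0.50, 0.75):
    res = model.fit(q=tau)   # one fit per quantile of the conditional distribution
    print(f"tau={tau}:", res.params.round(3).to_dict())
```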
PERFORMING THE REGRESSION ANALYSIS IN SPSS
1. Go to Analyze → Regression → Linear.
2. Click Statistics → check "Model Fit", "R Square Change", "Collinearity Diagnostics", and "Durbin-Watson", then click "Continue".
3. Click Plots → transfer "*ZRESID" to the Y box and "*ZPRED" to the X box → check "Histogram" and "Normal Probability plot", then click "Continue".
4. Click "OK".
CHECKING THE ASSUMPTIONS

(Diagnostic plots: linearity, normality, and homoskedasticity.)

INTERPRETATION OF THE RESULT OUTPUT
The Durbin-Watson statistic determines whether the model satisfies the assumption of independence of errors/residuals. Refer to a Durbin-Watson table (https://www.real-statistics.com/statistics-tables/durbin-watson-table/) and look for the values of the Lower Limit (LL) and Upper Limit (UL). If the Durbin-Watson result lies between the LL and UL, then the model fails to satisfy the assumption. In this case, D-W = 1.542 and, looking at the D-W table with k = 1 and n = 22, LL = 0.997 and UL = 1.174; the derived D-W does not lie between the LL and UL, thus it satisfies the assumption.

The Adjusted R-squared indicates how strong the association between the DV and IV is. In this case, 46% of the variation/changes in Sales can be explained by Radio Advertisement.
If the p-value is <.05, the null hypothesis that there is no significant relationship/association between the dependent variable and the independent variable(s) is rejected; otherwise, we fail to reject it and conclude there is no association.
Note: If it is >.05, there is no need to interpret the other parts.

If the p-value is <.05, the null hypothesis stated as "the derived coefficient in the model is not significant" is rejected; otherwise, we fail to reject it and conclude there is no association.
Note: If it is >.05, there is no need to interpret the derived coefficient (B).

 Intercept = 699.96 means that when radio ads equal zero (0), sales equal $699.96.
 Slope/coefficient of the independent variable = 12.162 means that for every one (1) unit increase in radio ads, there is a corresponding increase of $12.16 in sales.
MULTIPLE LINEAR REGRESSION

In multiple linear regression analysis, the same concepts apply; however, there is one additional assumption that needs to be satisfied: no multicollinearity exists in the model.

STEPS OF MULTIPLE LINEAR REGRESSION ANALYSIS

• Examine variation measures
• Do residual analysis
• Test parameter significance
  – Overall model
  – Portions of model
  – Individual coefficients
• Test for multicollinearity
(A minimal code sketch of these steps appears below.)
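Here is that sketch with statsmodels, using simulated data with variable names mirroring the cynicism/commitment example (not the actual dataset):

```python
# A minimal sketch of the multiple-regression steps above with statsmodels
# (simulated data; variable names mirror the lecture's example).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "affective_cynicism": rng.normal(3, 1, 184),
    "behavioral_cynicism": rng.normal(3, 1, 184),
    "cognitive_cynicism": rng.normal(3, 1, 184),
})
df["affective_com"] = (3.1 - 0.2 * df["affective_cynicism"]
                       + 0.4 * df["cognitive_cynicism"]
                       + rng.normal(0, 0.7, 184))

res = smf.ols("affective_com ~ affective_cynicism + behavioral_cynicism"
              " + cognitive_cynicism", df).fit()
print(res.summary())   # overall model, individual coefficients, p-values

exog = res.model.exog  # design matrix including the intercept column
for i, name in enumerate(res.model.exog_names[1:], start=1):  # skip the intercept
    print(name, round(variance_inflation_factor(exog, i), 2))  # multicollinearity check
```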
MULTICOLLINEARITY
• High correlation between X variables.
• Coefficients measure a combined effect.
• Leads to unstable coefficients, depending on the X variables in the model.
• Always exists; it is a matter of degree.
• Example: using both the total number of print ads and newspaper ads as explanatory variables in the same model.

DETECTING AND REMEDYING MULTICOLLINEARITY
• Examine the correlation matrix: correlations between pairs of X variables are greater than with the Y variable.
• Use the VIF result:
  – VIF starts at 1 and has no upper limit.
  – VIF = 1: no correlation between the independent variable and the other variables.
  – VIF exceeding 5 or 10 indicates high multicollinearity between this independent variable and the others.
• A few remedies:
  – Obtain new sample data.
  – Eliminate one of the correlated X variables.

Both tests (correlation matrix and VIF) indicate that NO COLLINEARITY EXISTS.

Sales = 12.032 Radio Ad + 245.23


THANK YOU
