Chapter 4 - Multiple Regression Analysis
Y = α + β1X1 + β2X2 + … + βkXk + u
α, β1, …, βk are parameters
X1, X2, …, Xk are known constants
u is the error term
Several Predictor Variables
Y = β0 + β1X1 + β2X2 + … + βPXP
Y is the dependent (response) variable.
X1, X2, …, XP are the independent (explanatory) variables.
For two independent variables, the general form of
the multiple regression equation is:
Y = a + b1X1 + b2X2 + u
X1 and X2 are the independent variables.
a is the Y-intercept.
b1 is the net change in Y for each unit change in X1,
holding X2 constant.
It is called a partial regression coefficient, a net
regression coefficient or just a regression
coefficient.
b2 is the net change in Y for each unit change
in X2, holding X1 constant.
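The "holding the other variable constant" reading of b1 and b2 can be checked numerically. The sketch below is only an illustration: the coefficient values a = 10.0, b1 = 2.0 and b2 = 0.5 are hypothetical and do not come from the slides.

```python
def predict(x1, x2, a=10.0, b1=2.0, b2=0.5):
    """Predicted Y = a + b1*X1 + b2*X2 (coefficients are hypothetical)."""
    return a + b1 * x1 + b2 * x2


# Baseline prediction at an arbitrary (hypothetical) point.
base = predict(x1=12, x2=40)

# Raise X1 by one unit while X2 stays fixed: Y changes by b1.
change_x1 = predict(x1=13, x2=40) - base

# Raise X2 by one unit while X1 stays fixed: Y changes by b2.
change_x2 = predict(x1=12, x2=41) - base

print(change_x1, change_x2)
```

The two printed differences equal the partial regression coefficients, which is what "net change" means here.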
Statistical Model for Multiple Regression
A correlation matrix is used in multiple regression
to estimate the parameters.
The analysis of the correlation matrix is an
important step in the solution of any problem
involving many independent variables.
A correlation coefficient (r12) indicates the
relationship between variable 1 and variable 2, and so on.
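As a rough illustration of what a correlation matrix contains, the sketch below computes Pearson correlations for every pair of variables. The five rows of data and the column names y, x1 and x2 are hypothetical, chosen only to make the example run; they are not the values from the slides' table.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(columns):
    """Nested dict: matrix[row][col] is the correlation of the two columns."""
    names = list(columns)
    return {r: {c: pearson_r(columns[r], columns[c]) for c in names}
            for r in names}


# Hypothetical data for five observations (not taken from the slides).
data = {
    "y": [20, 25, 30, 40, 45],
    "x1": [10, 12, 12, 15, 16],
    "x2": [30, 35, 42, 40, 50],
}

corr = correlation_matrix(data)
for name, row in corr.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

The diagonal is 1 by construction, and the matrix is symmetric, so only the off-diagonal entries (r_y1, r_y2, r_12) carry information.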
Suppose that we took 5 randomly selected
salespeople and collected the information in
Table 1.2.
Example
You have the following statistics from the data.
Use the correlation coefficients to calculate the
multiple correlation coefficient (R).
Therefore R = 0.9360, which means the combined
correlation of Years of Education and
Motivation with Annual Sales is 0.9360.
So the two variables together have a strong relationship
with annual sales.
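For two predictors, the multiple correlation coefficient can be obtained from the three pairwise correlations using the standard formula R² = (r_y1² + r_y2² − 2·r_y1·r_y2·r_12) / (1 − r_12²). The sketch below assumes that formula; the input correlations are hypothetical, since the slides' own statistics are not reproduced here, so the result is not expected to match 0.9360.

```python
from math import sqrt


def multiple_r(r_y1, r_y2, r_12):
    """Multiple correlation of Y with X1 and X2 from pairwise correlations."""
    r_squared = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)
    return sqrt(r_squared)


# Hypothetical pairwise correlations (not the values from the slides).
R_Y1 = 0.7   # Y with X1
R_Y2 = 0.6   # Y with X2
R_12 = 0.3   # X1 with X2

R = multiple_r(R_Y1, R_Y2, R_12)
print(round(R, 4))
```

When the two predictors are uncorrelated (r_12 = 0), the formula reduces to R² = r_y1² + r_y2².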
Multiple Regression Estimation of Parameters
Let's make a prediction.
You interviewed a potential salesperson;
she had 13 years of education and scored 49
on the Higgins Motivation Scale.
How much money would this salesperson
bring in on an annual basis?
Example
SDy = 10.25
SDx1 = 2.75
SDx2 = 4.15
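One common way to estimate the parameters from summary statistics is to compute standardized coefficients from the correlations and then rescale them by the standard deviations. The sketch below assumes that approach. The three standard deviations and the prediction inputs (13 years of education, a motivation score of 49) come from the slides; the correlations and the means are hypothetical placeholders, so the printed prediction is illustrative only.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized coefficients for a two-predictor regression."""
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2


def raw_coefficients(r_y1, r_y2, r_12, sd_y, sd_x1, sd_x2,
                     mean_y, mean_x1, mean_x2):
    """Intercept a and slopes b1, b2 in the original units."""
    beta1, beta2 = standardized_betas(r_y1, r_y2, r_12)
    b1 = beta1 * sd_y / sd_x1
    b2 = beta2 * sd_y / sd_x2
    a = mean_y - b1 * mean_x1 - b2 * mean_x2
    return a, b1, b2


# Standard deviations as given in the slides.
SD_Y, SD_X1, SD_X2 = 10.25, 2.75, 4.15

# Hypothetical correlations and means (not given in this section).
R_Y1, R_Y2, R_12 = 0.7, 0.6, 0.3
MEAN_Y, MEAN_X1, MEAN_X2 = 30.0, 12.0, 40.0

a, b1, b2 = raw_coefficients(R_Y1, R_Y2, R_12, SD_Y, SD_X1, SD_X2,
                             MEAN_Y, MEAN_X1, MEAN_X2)

# Prediction for 13 years of education and a motivation score of 49.
predicted_sales = a + b1 * 13 + b2 * 49
print(round(predicted_sales, 2))
```

A useful sanity check on any fitted equation of this kind is that plugging in the predictor means returns the mean of Y.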
H0: μ1 = μ2 = μ3 = μ4
H1: Means are not all equal
At α=0.05
Step 2. Determine the Critical Value
The appropriate critical value can be found in a
table of probabilities for the F distribution.
In order to determine the critical value of F we
need the degrees of freedom, df1 = k-1 and df2 = N-k,
where k is the number of independent
comparison groups and N is the total sample size.
In this example, df1 = k-1 = 4-1 = 3 and
df2 = N-k = 20-4 = 16. The critical value is 3.24 and the decision
rule is as follows: Reject H0 if F > 3.24.
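The degrees-of-freedom arithmetic and the decision rule can be written out as a small sketch. The values k = 4, N = 20 and the critical value 3.24 are taken from the text; the F statistic of 5.0 passed to the decision function is a hypothetical placeholder, and the function names are illustrative.

```python
def degrees_of_freedom(k, n_total):
    """Return (df1, df2) for a one-way ANOVA with k groups and N observations."""
    return k - 1, n_total - k


def reject_h0(f_statistic, critical_value):
    """Decision rule: reject H0 when F exceeds the critical value."""
    return f_statistic > critical_value


df1, df2 = degrees_of_freedom(k=4, n_total=20)

# Critical value quoted in the text for df1 = 3, df2 = 16 at alpha = 0.05.
CRITICAL_VALUE = 3.24

# Hypothetical F statistic, used only to demonstrate the rule.
decision = reject_h0(f_statistic=5.0, critical_value=CRITICAL_VALUE)
print(df1, df2, decision)
```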
Step 3. Compute Group Means
Compute
Next we Compute
SSE requires computing the squared differences between
each observation and its group mean. We will compute
SSE in parts. For the participants in the low calorie diet.
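The group means and the SSE described above can be sketched as follows. The four groups and their observations are hypothetical (five per group, to match k = 4 and N = 20); they are not the study's data, and the group names are placeholders.

```python
def group_mean(values):
    """Arithmetic mean of one group's observations."""
    return sum(values) / len(values)


def sse(groups):
    """Sum of squared differences between each observation and its group mean."""
    total = 0.0
    for values in groups.values():
        mean = group_mean(values)
        total += sum((v - mean) ** 2 for v in values)
    return total


# Hypothetical observations for four groups of five participants each.
groups = {
    "group_a": [5, 7, 6, 8, 4],
    "group_b": [3, 2, 4, 3, 3],
    "group_c": [4, 6, 5, 5, 5],
    "group_d": [1, 2, 0, 1, 1],
}

means = {name: group_mean(values) for name, values in groups.items()}
error_sum_of_squares = sse(groups)
print(means, error_sum_of_squares)
```

Computing the squared deviations one group at a time, as the loop does, mirrors the "in parts" approach: each group contributes its own partial sum to the total SSE.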