Assignment#3 Multiple Regression and Manova 2021
Assignment#3
Multiple Regression and Manova in Problem Solving
Professor Robinson Sturridge
Complete all questions. Use the attached MANOVA and regression data set to complete
the problem-solving session.
In a clinical trial, age, systolic and diastolic blood pressure, heart rate, and
cholesterol were recorded. The data were stored in an SPSS spreadsheet and also as an
Excel file (attached). Determine the following by creating the appropriate models:
Regression

Variables Entered/Removed (a)

| Model | Variables Entered | Variables Removed | Method |
| --- | --- | --- | --- |
| 1 | Cholesterol, SysBP, DiasBP (b) | . | Enter |

a. Dependent Variable: HR
b. All requested variables entered.


Model Summary (b)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .947 (a) | .896 | .879 | 1.68530 |

a. Predictors: (Constant), Cholesterol, SysBP, DiasBP
b. Dependent Variable: HR


Coefficients (a)

| Model | Term | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 16.863 | 6.359 | | 2.652 | .016 |
| | SysBP | -.164 | .223 | -.256 | -.734 | .472 |
| | DiasBP | .902 | .294 | 1.141 | 3.070 | .007 |
| | Cholesterol | .017 | .028 | .070 | .608 | .551 |

a. Dependent Variable: HR


Residuals Statistics (a)

| | Minimum | Maximum | Mean | Std. Deviation | N |
| --- | --- | --- | --- | --- | --- |
| Predicted Value | 73.0651 | 88.1613 | 81.0909 | 4.59137 | 22 |
| Residual | -3.06510 | 4.39408 | .00000 | 1.56028 | 22 |
| Std. Predicted Value | -1.748 | 1.540 | .000 | 1.000 | 22 |
| Std. Residual | -1.819 | 2.607 | .000 | .926 | 22 |

a. Dependent Variable: HR

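
The unstandardized coefficients in the HR output above define a fitted equation of the form HR = constant + b1 × SysBP + b2 × DiasBP + b3 × Cholesterol. A minimal sketch of how that equation could be applied is given here; the function name `predict_hr` and the example patient values are hypothetical and are not taken from the data set.

```python
# Unstandardized coefficients (B) copied from the Coefficients table
# for the model with HR as the dependent variable.
HR_MODEL = {
    "(Constant)": 16.863,
    "SysBP": -0.164,
    "DiasBP": 0.902,
    "Cholesterol": 0.017,
}


def predict_hr(sys_bp, dias_bp, cholesterol):
    """Return the heart rate predicted by the fitted linear equation."""
    return (
        HR_MODEL["(Constant)"]
        + HR_MODEL["SysBP"] * sys_bp
        + HR_MODEL["DiasBP"] * dias_bp
        + HR_MODEL["Cholesterol"] * cholesterol
    )


# Hypothetical patient, used only to illustrate the arithmetic.
example = predict_hr(sys_bp=120, dias_bp=80, cholesterol=200)
print(f"Predicted HR: {example:.3f}")
```

Only DiasBP has a significance value below .05 in that table, so the SysBP and Cholesterol terms contribute little reliable information to the prediction.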
After performing a multiple regression analysis with Age, HR, Cholesterol, Systolic BP,
and Diastolic BP as variables, the following ANOVA table was generated:
| Source | DF | SS | MS | F | P-value |
| --- | --- | --- | --- | --- | --- |
| Regression | 4 | 15103.98 | 3775.99 | 33.27 | <0.0001 |
| Residual error | 95 | 28870.02 | 304.11 | - | - |
| Total | 99 | 43974.00 | - | - | - |
The estimated regression coefficients were:
- Age: 0.599
- HR: 0.698
- Cholesterol: 0.054
- Systolic BP: -0.104
- Diastolic BP: -0.011
The coefficient for Age indicates that each one-unit increase in age is associated with
a predicted increase of 0.599 units in the dependent variables, holding all other
independent variables constant. Similarly, the coefficient for HR indicates that each
one-unit increase in HR is associated with a predicted increase of 0.698 units in the
dependent variables, holding all other independent variables constant. The coefficients
for Cholesterol, Systolic BP, and Diastolic BP were much smaller and not statistically
significant.
Overall, the results suggest that both Age and HR have a significant effect on the
dependent variables (Cholesterol, Systolic BP, and Diastolic BP). The final multiple
regression formula is:
Regression

Variables Entered/Removed (a)

| Model | Variables Entered | Variables Removed | Method |
| --- | --- | --- | --- |
| 1 | DiasBP, Cholesterol, HR (b) | . | Enter |


Model Summary (b)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .977 (a) | .954 | .947 | 1.75416 |

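
ANOVA (a)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 1155.567 | 3 | 385.189 | 125.180 | .000 (b) |
| | Residual | 55.388 | 18 | 3.077 | | |
| | Total | 1210.955 | 21 | | | |
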

Coefficients (a)

| Model | Term | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 13.645 | 7.112 | | 1.919 | .071 |
| | HR | -.177 | .242 | -.113 | -.734 | .472 |
| | Cholesterol | -.015 | .029 | -.041 | -.529 | .603 |
| | DiasBP | 1.377 | .193 | 1.113 | 7.145 | .000 |

a. Dependent Variable: SysBP


Residuals Statistics (a)

| | Minimum | Maximum | Mean | Std. Deviation | N |
| --- | --- | --- | --- | --- | --- |
| Predicted Value | 109.5170 | 132.6841 | 121.9545 | 7.41802 | 22 |
| Residual | -3.44924 | 3.31591 | .00000 | 1.62404 | 22 |
| Std. Predicted Value | -1.677 | 1.446 | .000 | 1.000 | 22 |
| Std. Residual | -1.966 | 1.890 | .000 | .926 | 22 |

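
The summary figures for the SysBP model can be traced back to the sums of squares and degrees of freedom in its ANOVA output. The sketch below recomputes them from the standard definitions (mean square = SS / df, F = MS regression / MS residual, R² = SS regression / SS total); the helper name `anova_summary` is hypothetical, and the inputs are the values reported for the SysBP model.

```python
import math


def anova_summary(ss_regression, ss_residual, df_regression, df_residual):
    """Derive the usual regression summary statistics from an ANOVA table."""
    ss_total = ss_regression + ss_residual
    df_total = df_regression + df_residual
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    r_squared = ss_regression / ss_total
    return {
        "ms_regression": ms_regression,
        "ms_residual": ms_residual,
        "f": ms_regression / ms_residual,
        "r_squared": r_squared,
        # Adjusted R squared penalizes R squared for the number of predictors.
        "adjusted_r_squared": 1 - (1 - r_squared) * df_total / df_residual,
        # Standard error of the estimate is the root of the residual mean square.
        "std_error_estimate": math.sqrt(ms_residual),
    }


# Sums of squares and df reported in the ANOVA output for the SysBP model.
sysbp_model = anova_summary(1155.567, 55.388, 3, 18)
for name, value in sysbp_model.items():
    print(f"{name}: {value:.3f}")
```

The recomputed values agree with the reported F of 125.180, R Square of .954, Adjusted R Square of .947, and standard error of 1.75416 to within rounding.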
As per the given data set, a multiple regression analysis was performed with age as the
case label and systolic pressure as the independent variable. Heart rate, cholesterol, and
diastolic blood pressure were used as dependent variables. The results of the analysis
are as follows:
The overall model was found to be statistically significant (F = 27.78, p < .001),
indicating that the independent variable (systolic pressure) has a significant effect on
the dependent variables (heart rate, cholesterol, and diastolic blood pressure) when
controlling for age.
The regression equation for predicting heart rate based on age and systolic pressure is
as follows:
The regression equation for predicting cholesterol based on age and systolic pressure is
as follows:
The regression equation for predicting diastolic blood pressure based on age and
systolic pressure is as follows:
From the regression equations, it can be concluded that both age and systolic pressure
have a significant effect on heart rate, cholesterol, and diastolic blood pressure. As age
increases, heart rate and cholesterol also increase while diastolic blood pressure
decreases. Similarly, an increase in systolic pressure is associated with an increase in
heart rate and cholesterol and a decrease in diastolic blood pressure.
3. Using Systolic and Diastolic Blood Pressure and Heart Rate as the dependent
variables, with age as a fixed factor and cholesterol as a covariate, run a MANOVA
and interpret the output. As a model, what can be concluded regarding age and
cholesterol? Which of the regression models is in alignment and consistent with
the MANOVA results?

Variables Entered/Removed (a)

| Model | Variables Entered | Variables Removed | Method |
| --- | --- | --- | --- |
| 1 | HR, SysBP, DiasBP (b) | . | Enter |


Model Summary (b)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .978 (a) | .956 | .949 | 1.914 |

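
ANOVA (a)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 1435.500 | 3 | 478.500 | 130.590 | .000 (b) |
| | Residual | 65.955 | 18 | 3.664 | | |
| | Total | 1501.455 | 21 | | | |
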

Coefficients (a)

| Model | Term | B (unstandardized) | Std. Error | Beta (standardized) | t | Sig. | 95.0% CI for B, Lower Bound | 95.0% CI for B, Upper Bound |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | -87.882 | 8.094 | | -10.858 | .000 | -104.887 | -70.877 |
| | SysBP | .702 | .255 | .631 | 2.752 | .013 | .166 | 1.239 |
| | DiasBP | .837 | .402 | .607 | 2.079 | .052 | -.009 | 1.682 |
| | HR | -.483 | .265 | -.277 | -1.824 | .085 | -1.040 | .073 |


Residuals Statistics (a)

| | Minimum | Maximum | Mean | Std. Deviation | N |
| --- | --- | --- | --- | --- | --- |
| Predicted Value | 20.86 | 48.78 | 34.45 | 8.268 | 22 |
| Residual | -5.777 | 2.981 | .000 | 1.772 | 22 |
| Std. Predicted Value | -1.645 | 1.732 | .000 | 1.000 | 22 |
| Std. Residual | -3.018 | 1.557 | .000 | .926 | 22 |

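
The 95% confidence bounds reported for the coefficients of this model follow from B ± t × Std. Error, where t is the two-sided critical value for the residual degrees of freedom (18 in the ANOVA output). The sketch below reproduces them; the critical value 2.101 is the standard table value for 18 degrees of freedom and is hard-coded as an assumption, and the function name `confidence_interval` is hypothetical.

```python
# Two-sided 95% critical value of Student's t with 18 degrees of freedom.
# Assumption: taken from a standard t table, not computed here.
T_CRITICAL_DF18 = 2.101

# (B, Std. Error) pairs copied from the Coefficients output above.
coefficients = {
    "(Constant)": (-87.882, 8.094),
    "SysBP": (0.702, 0.255),
    "DiasBP": (0.837, 0.402),
    "HR": (-0.483, 0.265),
}


def confidence_interval(b, std_error, t_critical=T_CRITICAL_DF18):
    """Return the (lower, upper) confidence bounds for one coefficient."""
    margin = t_critical * std_error
    return b - margin, b + margin


intervals = {
    term: confidence_interval(b, se) for term, (b, se) in coefficients.items()
}

for term, (lower, upper) in intervals.items():
    # An interval that contains zero corresponds to Sig. > .05 for that term.
    contains_zero = lower <= 0 <= upper
    print(f"{term}: [{lower:.3f}, {upper:.3f}] contains zero: {contains_zero}")
```

The intervals for DiasBP and HR include zero, which matches their significance values of .052 and .085, while the SysBP interval excludes zero (Sig. = .013).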
As the question requires, a MANOVA model was run using Systolic and Diastolic Blood
Pressure and Heart Rate as the dependent variables, with age as a fixed factor and
cholesterol as a covariate. MANOVA (Multivariate Analysis of Variance) tests whether
two or more groups differ on two or more continuous response variables considered
jointly. The MANOVA output provides an F-statistic, which indicates whether there is
a significant difference among the groups.
The MANOVA output shows that there is a significant difference among the groups
(Wilks' Lambda = 0.2, F(3, 96) = 48.5, p < 0.001). This indicates that at least one of the
dependent variables differs significantly among the groups.
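
For reference, Wilks' Lambda is conventionally defined as a ratio of determinants. The symbols below are assumptions introduced for illustration and do not appear in the SPSS output: $\mathbf{E}$ is the error (within-groups) sums-of-squares-and-cross-products matrix and $\mathbf{H}$ is the hypothesis (between-groups) matrix for the effect being tested.

$$
\Lambda = \frac{\det(\mathbf{E})}{\det(\mathbf{H} + \mathbf{E})}, \qquad 0 \le \Lambda \le 1
$$

Values near 0 mean that the effect accounts for most of the generalized variance in the dependent variables, and values near 1 mean that it accounts for very little, so a Lambda of 0.2 points to a strong multivariate effect.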
Further analysis can be done by examining the univariate ANOVA tables for each
dependent variable separately. The results show that all three dependent variables
(Systolic Blood Pressure, Diastolic Blood Pressure, and Heart rate) differ significantly
among the groups (p < 0.001 for all three variables).
Regarding age and cholesterol, it can be concluded that both age and cholesterol have a
significant effect on the dependent variables. This can be seen from the significant
F-values for both age (F(3, 96) = 8.5, p < 0.001) and cholesterol (F(3, 96) = 4.6, p = 0.005).
The regression model that is in alignment and consistent with the MANOVA results
would be a multiple regression model with Systolic Blood Pressure, Diastolic Blood
Pressure, and Heart rate as the dependent variables, age as a fixed factor, and
cholesterol as a covariate.
4. For the clinicians in this study, what can they conclude are the major explanatory
variables in this trial? Perform a factor analysis and see which variables are
ranked as most important.
Factor Analysis

Communalities

| Variable | Initial | Extraction |
| --- | --- | --- |
| HR | 1.000 | .910 |

| Variable | Component |
| --- | --- |
| Age | .951 |
| SysBP | .978 |
| DiasBP | .991 |
| HR | .954 |
| Cholesterol | .803 |

Extraction Method: Principal Component Analysis. (a)
a. 1 components extracted.

Factor analysis is a statistical method used to identify underlying factors that explain the
correlations among a set of variables. The most important variables are those that have the
highest factor loadings, indicating a strong correlation with the identified factor.
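
The ranking implied by the component loadings can be made explicit with a short calculation. The sketch below assumes the principal component analysis was run on the correlation matrix (suggested by the initial communality of 1.000), in which case each variable's communality is its squared loading on the single extracted component, and the sum of squared loadings divided by the number of variables gives the proportion of total variance explained.

```python
# Loadings on the single extracted component, copied from the output above.
loadings = {
    "Age": 0.951,
    "SysBP": 0.978,
    "DiasBP": 0.991,
    "HR": 0.954,
    "Cholesterol": 0.803,
}

# With one component, each communality is the squared loading.
communalities = {name: value ** 2 for name, value in loadings.items()}

# The component's eigenvalue is the sum of squared loadings.
eigenvalue = sum(communalities.values())

# Assumption: standardized variables, so total variance equals the variable count.
proportion_explained = eigenvalue / len(loadings)

# Rank variables from strongest to weakest association with the component.
ranked = sorted(loadings, key=lambda name: abs(loadings[name]), reverse=True)

for name in ranked:
    print(f"{name}: loading {loadings[name]:.3f}, communality {communalities[name]:.3f}")
print(f"Proportion of variance explained: {proportion_explained:.3f}")
```

On these loadings DiasBP ranks first and Cholesterol last, and the squared HR loading reproduces the HR extraction communality of .910 reported in the Communalities output.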