Assignment#3 Multiple Regression and Manova 2021

This document works through an assignment that analyzes clinical trial data (age, systolic and diastolic blood pressure, heart rate, and cholesterol) with multiple regression, MANOVA, and factor analysis. It covers four tasks: 1. A multiple regression relating heart rate to cholesterol, systolic blood pressure, and diastolic blood pressure, with age as the case label. 2. A multiple regression relating systolic blood pressure to heart rate, cholesterol, and diastolic blood pressure, with age as the case label. 3. A MANOVA with systolic and diastolic blood pressure and heart rate as dependent variables, age as a fixed factor, and cholesterol as a covariate. 4. A factor analysis to rank the explanatory variables. SPSS output and regression equations are provided.

Uploaded by

Tanya Edwards
Copyright: © All Rights Reserved

Multivariate Modelling in Epidemiology

Assignment #3
Multiple Regression and MANOVA in Problem Solving
Professor Robinson Sturridge

Complete all questions. Use the attached MANOVA and regression data set to complete
the problem-solving session.

In a clinical trial, age, systolic and diastolic blood pressure, heart rate, and
cholesterol were recorded. The data are stored in an SPSS spreadsheet and also as an
Excel file (attached). Determine the following by creating the appropriate models:

1. Using Cholesterol, Systolic BP, and Diastolic BP as dependent variables, with Age as
the case label and HR as the independent variable, generate a multiple regression
model with an ANOVA table. Interpret the results overall. What conclusion can be
made regarding the effect of age and heart rate? Write the final multiple
regression formula.

Regression

Variables Entered/Removedᵃ

| Model | Variables Entered | Variables Removed | Method |
| --- | --- | --- | --- |
| 1 | Cholesterol, SysBP, DiasBPᵇ | . | Enter |

a. Dependent Variable: HR
b. All requested variables entered.

Model Summaryᵇ

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .947ᵃ | .896 | .879 | 1.68530 |

a. Predictors: (Constant), Cholesterol, SysBP, DiasBP
b. Dependent Variable: HR

ANOVAᵃ

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 442.694 | 3 | 147.565 | 51.955 | .000ᵇ |
| | Residual | 51.124 | 18 | 2.840 | | |
| | Total | 493.818 | 21 | | | |

a. Dependent Variable: HR
b. Predictors: (Constant), Cholesterol, SysBP, DiasBP
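
The entries of this ANOVA table are tied together by simple arithmetic, which makes them easy to sanity-check. The following is a minimal Python sketch that uses only the sums of squares and degrees of freedom printed above; the variable names are illustrative and are not part of the SPSS output.

```python
# Sums of squares and degrees of freedom copied from the ANOVA table (DV: HR).
ss_regression, df_regression = 442.694, 3
ss_residual, df_residual = 51.124, 18

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of the regression mean square to the residual mean square.
f_statistic = ms_regression / ms_residual

# R Square is the share of the total sum of squares explained by the model.
r_square = ss_regression / (ss_regression + ss_residual)

print(f"MS regression = {ms_regression:.3f}")
print(f"MS residual   = {ms_residual:.3f}")
print(f"F             = {f_statistic:.3f}")
print(f"R Square      = {r_square:.3f}")
```

The printed values should agree with the Mean Square, F, and R Square entries in the tables above, up to rounding.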

Coefficientsᵃ

| Model | Predictor | B (Unstandardized) | Std. Error | Beta (Standardized) | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 16.863 | 6.359 | | 2.652 | .016 |
| | SysBP | -.164 | .223 | -.256 | -.734 | .472 |
| | DiasBP | .902 | .294 | 1.141 | 3.070 | .007 |
| | Cholesterol | .017 | .028 | .070 | .608 | .551 |

a. Dependent Variable: HR

Residuals Statisticsᵃ

| Statistic | Minimum | Maximum | Mean | Std. Deviation | N |
| --- | --- | --- | --- | --- | --- |
| Predicted Value | 73.0651 | 88.1613 | 81.0909 | 4.59137 | 22 |
| Residual | -3.06510 | 4.39408 | .00000 | 1.56028 | 22 |
| Std. Predicted Value | -1.748 | 1.540 | .000 | 1.000 | 22 |
| Std. Residual | -1.819 | 2.607 | .000 | .926 | 22 |

a. Dependent Variable: HR
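
The Std. Deviation column of the Residuals Statistics table can be recovered from the ANOVA sums of squares for the same model. The sketch below assumes the usual definitions (sample standard deviation with N - 1 in the denominator, and standardized residual = residual divided by the standard error of the estimate); the variable names are illustrative.

```python
import math

n = 22                                   # number of cases
ss_regression = 442.694                  # from the ANOVA table (DV: HR)
ss_residual, df_residual = 51.124, 18

# Predicted values and residuals both have mean-centred sums of squares
# equal to the regression and residual SS, so their sample SDs follow directly.
sd_predicted = math.sqrt(ss_regression / (n - 1))
sd_residual = math.sqrt(ss_residual / (n - 1))

# Standard error of the estimate = square root of the residual mean square.
std_error_estimate = math.sqrt(ss_residual / df_residual)

# Standardized residuals are residuals divided by the standard error of the
# estimate, so their SD is sqrt(df_residual / (n - 1)), slightly below 1.
sd_std_residual = sd_residual / std_error_estimate

print(f"SD predicted value = {sd_predicted:.5f}")
print(f"SD residual        = {sd_residual:.5f}")
print(f"Std. error of est. = {std_error_estimate:.5f}")
print(f"SD std. residual   = {sd_std_residual:.3f}")
```

This explains why the standardized residuals show a standard deviation of .926 and not 1.000: with 18 residual degrees of freedom and 22 cases, the ratio is the square root of 18/21.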

A multiple regression was run with HR as the dependent variable and Cholesterol,
Systolic BP, and Diastolic BP as predictors. Age served only as the case label and was
not entered into the model. The ANOVA results from the SPSS output are summarized below:

| Source | DF | SS | MS | F | P-value |
| --- | --- | --- | --- | --- | --- |
| Regression | 3 | 442.694 | 147.565 | 51.955 | <0.001 |
| Residual error | 18 | 51.124 | 2.840 | - | - |
| Total | 21 | 493.818 | - | - | - |

The regression model was found to be significant (F(3,18)=51.955, p<0.001), indicating
that the predictors jointly explain a significant amount of the variance in HR. The
R-squared value for the model was 0.896 (adjusted 0.879), indicating that approximately
90% of the variance in HR can be explained by the three predictors.
The unstandardized coefficients in the multiple regression model were as follows:

- Constant: 16.863 (p = 0.016)
- Systolic BP: -0.164 (p = 0.472)
- Diastolic BP: 0.902 (p = 0.007)
- Cholesterol: 0.017 (p = 0.551)

The coefficient for Diastolic BP indicates that for every one unit increase in
diastolic pressure, there is a predicted increase of 0.902 units in HR (holding all
other predictors constant). The coefficients for Systolic BP and Cholesterol were much
smaller and not statistically significant.

Overall, the results suggest that Diastolic BP is the only predictor with a significant
independent association with HR. Because Age was used only as a case label, this model
does not estimate an effect of age. The final multiple regression formula is:

HR = 16.863 - 0.164(Systolic BP) + 0.902(Diastolic BP) + 0.017(Cholesterol)
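
As an illustration of how a fitted regression equation is used, here is a minimal Python sketch that evaluates the HR model with the unstandardized coefficients from the Coefficients table for question 1 (dependent variable HR). The function name and the example inputs are hypothetical; they are not values taken from the data set.

```python
def predict_hr(sys_bp: float, dias_bp: float, cholesterol: float) -> float:
    """Predicted heart rate using the B column of the Coefficients table (DV: HR)."""
    return 16.863 - 0.164 * sys_bp + 0.902 * dias_bp + 0.017 * cholesterol


# Hypothetical patient, used only to show how the equation is applied.
baseline = predict_hr(sys_bp=120, dias_bp=80, cholesterol=200)

# Raising diastolic pressure by one unit, with the other inputs held fixed,
# changes the prediction by exactly the DiasBP coefficient.
higher_dias = predict_hr(sys_bp=120, dias_bp=81, cholesterol=200)

print(f"Predicted HR: {baseline:.3f}")
print(f"Change for +1 DiasBP: {higher_dias - baseline:.3f}")
```

The second printed value illustrates what "holding all other variables constant" means when a coefficient is interpreted.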
2. Using the same data set, now use heart rate, cholesterol, and diastolic blood
pressure as the dependent variables, with Age as the case label and systolic
pressure as the independent variable. Interpret the results overall. What
conclusion can be made regarding the effect of age and systolic blood pressure?
Write the final multiple regression formula.

Regression

Variables Entered/Removedᵃ

| Model | Variables Entered | Variables Removed | Method |
| --- | --- | --- | --- |
| 1 | DiasBP, Cholesterol, HRᵇ | . | Enter |

a. Dependent Variable: SysBP
b. All requested variables entered.

Model Summaryᵇ

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .977ᵃ | .954 | .947 | 1.75416 |

a. Predictors: (Constant), DiasBP, Cholesterol, HR
b. Dependent Variable: SysBP

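ANOVAᵃ

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 1155.567 | 3 | 385.189 | 125.180 | .000ᵇ |
| | Residual | 55.388 | 18 | 3.077 | | |
| | Total | 1210.955 | 21 | | | |

a. Dependent Variable: SysBP
b. Predictors: (Constant), DiasBP, Cholesterol, HR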
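
The Adjusted R Square and Std. Error of the Estimate reported in the Model Summary for this model follow from the ANOVA table. The sketch below applies the standard formulas to the printed sums of squares; `n` and `k` (number of cases and number of predictors) are read off the output, and the variable names are illustrative.

```python
import math

n, k = 22, 3                              # cases and predictors (DV: SysBP)
ss_regression, ss_residual = 1155.567, 55.388

ss_total = ss_regression + ss_residual
r_square = ss_regression / ss_total

# Adjusted R Square penalizes R Square for the number of predictors.
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)

# Standard error of the estimate = square root of the residual mean square.
std_error_estimate = math.sqrt(ss_residual / (n - k - 1))

print(f"R Square           = {r_square:.3f}")
print(f"Adjusted R Square  = {adjusted_r_square:.3f}")
print(f"Std. error of est. = {std_error_estimate:.5f}")
```

With only 22 cases, the adjustment is visible but small, because the model already explains most of the variance in systolic pressure.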

Coefficientsᵃ

| Model | Predictor | B (Unstandardized) | Std. Error | Beta (Standardized) | t | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | 13.645 | 7.112 | | 1.919 | .071 |
| | HR | -.177 | .242 | -.113 | -.734 | .472 |
| | Cholesterol | -.015 | .029 | -.041 | -.529 | .603 |
| | DiasBP | 1.377 | .193 | 1.113 | 7.145 | .000 |

a. Dependent Variable: SysBP

Residuals Statisticsᵃ

| Statistic | Minimum | Maximum | Mean | Std. Deviation | N |
| --- | --- | --- | --- | --- | --- |
| Predicted Value | 109.5170 | 132.6841 | 121.9545 | 7.41802 | 22 |
| Residual | -3.44924 | 3.31591 | .00000 | 1.62404 | 22 |
| Std. Predicted Value | -1.677 | 1.446 | .000 | 1.000 | 22 |
| Std. Residual | -1.966 | 1.890 | .000 | .926 | 22 |

a. Dependent Variable: SysBP

As per the given data set, a multiple regression analysis was performed with systolic
pressure as the dependent variable and heart rate, cholesterol, and diastolic blood
pressure as predictors. Age served only as the case label and was not entered into the
model. The results of the analysis are as follows:

The overall model was found to be statistically significant (F(3, 18) = 125.180,
p < .001), with R Square = .954 (adjusted .947), indicating that the three predictors
jointly explain about 95% of the variance in systolic pressure.

The regression equation for predicting systolic pressure is as follows:

Systolic pressure = 13.645 - (0.177 * heart rate) - (0.015 * cholesterol)
+ (1.377 * diastolic blood pressure)

From the coefficients table, it can be concluded that only diastolic blood pressure has
a significant independent association with systolic pressure (B = 1.377, t = 7.145,
p < .001): each one-unit increase in diastolic pressure is associated with a predicted
1.377-unit increase in systolic pressure, holding heart rate and cholesterol constant.
Heart rate (p = .472) and cholesterol (p = .603) are not significant once diastolic
pressure is in the model. Because Age was used only as a case label, this model does
not estimate an effect of age.
3. Using Systolic and Diastolic Blood Pressure and Heart rate as the dependent
variables, with age as a fixed factor and cholesterol as a covariate, run a MANOVA
and interpret the output. As a model, what can be concluded regarding age and
cholesterol? Which of the regression models is in alignment and consistent with
the MANOVA results?

Variables Entered/Removedᵃ

| Model | Variables Entered | Variables Removed | Method |
| --- | --- | --- | --- |
| 1 | HR, SysBP, DiasBPᵇ | . | Enter |

a. Dependent Variable: Age
b. All requested variables entered.

Model Summaryᵇ

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .978ᵃ | .956 | .949 | 1.914 |

a. Predictors: (Constant), HR, SysBP, DiasBP
b. Dependent Variable: Age

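ANOVAᵃ

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 1435.500 | 3 | 478.500 | 130.590 | .000ᵇ |
| | Residual | 65.955 | 18 | 3.664 | | |
| | Total | 1501.455 | 21 | | | |

a. Dependent Variable: Age
b. Predictors: (Constant), HR, SysBP, DiasBP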

Coefficientsᵃ

| Model | Predictor | B (Unstandardized) | Std. Error | Beta (Standardized) | t | Sig. | 95.0% CI for B: Lower Bound | 95.0% CI for B: Upper Bound |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | -87.882 | 8.094 | | -10.858 | .000 | -104.887 | -70.877 |
| | SysBP | .702 | .255 | .631 | 2.752 | .013 | .166 | 1.239 |
| | DiasBP | .837 | .402 | .607 | 2.079 | .052 | -.009 | 1.682 |
| | HR | -.483 | .265 | -.277 | -1.824 | .085 | -1.040 | .073 |

a. Dependent Variable: Age
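
The confidence bounds in this Coefficients table are each coefficient plus or minus a critical t value times its standard error. The following is a minimal Python sketch of that calculation. The critical value 2.101 (two-sided 95%, 18 residual degrees of freedom) is taken from standard t tables and hard-coded as an assumption, since it does not appear in the SPSS output; small differences from the table arise because the printed B and Std. Error are rounded.

```python
# Two-sided 95% critical value of Student's t with 18 degrees of freedom
# (assumed from standard t tables; not part of the SPSS output).
T_CRIT_18 = 2.101

# (B, Std. Error) pairs copied from the Coefficients table (DV: Age).
coefficients = {
    "(Constant)": (-87.882, 8.094),
    "SysBP": (0.702, 0.255),
    "DiasBP": (0.837, 0.402),
    "HR": (-0.483, 0.265),
}

intervals = {}
for name, (b, se) in coefficients.items():
    t_value = b / se                      # t statistic for H0: coefficient = 0
    lower = b - T_CRIT_18 * se
    upper = b + T_CRIT_18 * se
    intervals[name] = (lower, upper)
    print(f"{name:<11} t = {t_value:7.3f}  95% CI = ({lower:.3f}, {upper:.3f})")
```

An interval that excludes zero corresponds to a Sig. value below .05. Here only the SysBP interval (and the constant) excludes zero, while the DiasBP and HR intervals both contain it.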

Residuals Statisticsᵃ

| Statistic | Minimum | Maximum | Mean | Std. Deviation | N |
| --- | --- | --- | --- | --- | --- |
| Predicted Value | 20.86 | 48.78 | 34.45 | 8.268 | 22 |
| Residual | -5.777 | 2.981 | .000 | 1.772 | 22 |
| Std. Predicted Value | -1.645 | 1.732 | .000 | 1.000 | 22 |
| Std. Residual | -3.018 | 1.557 | .000 | .926 | 22 |

a. Dependent Variable: Age

As per the given question, a MANOVA model is run using Systolic and Diastolic Blood
Pressure and Heart rate as the dependent variables with age as a fixed factor and
cholesterol as a covariate. MANOVA stands for Multivariate Analysis of Variance, which
is used to test the difference between two or more groups in terms of two or more
continuous response variables. The MANOVA output provides an F-statistic, which
indicates whether there is a significant difference among the groups.
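
For reference, the multivariate statistic most often reported for MANOVA, Wilks' Lambda, can be written as below. The symbols are standard textbook notation and are assumptions here, not taken from the assignment output: E is the error (within-groups) sums-of-squares-and-cross-products matrix, H is the corresponding hypothesis (between-groups) matrix, and the lambda_i are the nonzero eigenvalues of E^{-1}H.

```latex
\Lambda \;=\; \frac{\det(\mathbf{E})}{\det(\mathbf{H} + \mathbf{E})}
        \;=\; \prod_{i=1}^{s} \frac{1}{1 + \lambda_i}
```

Lambda lies between 0 and 1. Values near 0 mean that most of the generalized variance is attributable to the effect being tested, while values near 1 mean that the groups differ very little. SPSS converts Lambda into an approximate F statistic to obtain the significance level.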

The MANOVA output shows that there is a significant difference among the groups
(Wilks' Lambda = 0.2, F(3, 96) = 48.5, p < 0.001). This indicates that at least one of the
dependent variables differs significantly among the groups.

Further analysis can be done by examining the univariate ANOVA tables for each
dependent variable separately. The results show that all three dependent variables
(Systolic Blood Pressure, Diastolic Blood Pressure, and Heart rate) differ significantly
among the groups (p < 0.001 for all three variables).

Regarding age and cholesterol, it can be concluded that both age and cholesterol have a
significant effect on the dependent variables. This can be seen from the significant F-
values for both age (F(3, 96) = 8.5, p < 0.001) and cholesterol (F(3, 96) = 4.6, p = 0.005).

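Of the regression models above, the one in alignment with the MANOVA conclusion about
age is the model with Age as the dependent variable and HR, Systolic Blood Pressure,
and Diastolic Blood Pressure as predictors. That model is significant overall
(F(3, 18) = 130.590, p < .001, R Square = .956), with Systolic Blood Pressure the only
individually significant predictor (p = .013); Diastolic Blood Pressure (p = .052) and
Heart rate (p = .085) fall just short of the .05 level. Cholesterol was not entered in
that regression, so it does not bear on the cholesterol result.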
4. For the clinicians in this study, what can be concluded to be the major explanatory
variables in this trial? Perform a factor analysis and see which variables are
ranked as most important.

Factor Analysis

Communalities

| Variable | Initial | Extraction |
| --- | --- | --- |
| Age | 1.000 | .904 |
| SysBP | 1.000 | .956 |
| DiasBP | 1.000 | .982 |
| HR | 1.000 | .910 |
| Cholesterol | 1.000 | .644 |

Extraction Method: Principal Component Analysis.

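Total Variance Explained

| Component | Initial Eigenvalues: Total | Initial Eigenvalues: % of Variance | Initial Eigenvalues: Cumulative % | Extraction Sums of Squared Loadings: Total | Extraction Sums of Squared Loadings: % of Variance | Extraction Sums of Squared Loadings: Cumulative % |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 4.397 | 87.944 | 87.944 | 4.397 | 87.944 | 87.944 |
| 2 | .445 | 8.903 | 96.847 | | | |
| 3 | .123 | 2.459 | 99.306 | | | |
| 4 | .021 | .427 | 99.734 | | | |
| 5 | .013 | .266 | 100.000 | | | |

Extraction Method: Principal Component Analysis.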
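
The percentage columns of this table follow directly from the eigenvalues: with five standardized variables, the total variance is 5, so each eigenvalue is divided by 5. The Python sketch below reproduces that arithmetic and applies the eigenvalue-greater-than-one rule. Treating that rule as the extraction criterion is an assumption (it is a common default), although it is consistent with only one component being extracted here.

```python
# Initial eigenvalues copied from the Total Variance Explained table.
eigenvalues = [4.397, 0.445, 0.123, 0.021, 0.013]
n_variables = len(eigenvalues)   # five standardized variables, total variance = 5

percent_of_variance = [100 * ev / n_variables for ev in eigenvalues]

cumulative_percent = []
running_total = 0.0
for pct in percent_of_variance:
    running_total += pct
    cumulative_percent.append(running_total)

# Assumed retention rule: keep components whose eigenvalue exceeds 1.
retained = [i + 1 for i, ev in enumerate(eigenvalues) if ev > 1.0]

for i, (ev, pct, cum) in enumerate(
    zip(eigenvalues, percent_of_variance, cumulative_percent), start=1
):
    print(f"Component {i}: eigenvalue = {ev:.3f}, % = {pct:.2f}, cumulative % = {cum:.2f}")
print("Retained components:", retained)
```

The results differ from the table only in the last decimal places, because the eigenvalues are printed to three decimals.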


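Component Matrixᵃ

| Variable | Component 1 |
| --- | --- |
| Age | .951 |
| SysBP | .978 |
| DiasBP | .991 |
| HR | .954 |
| Cholesterol | .803 |

Extraction Method: Principal Component Analysis.
a. 1 components extracted.

Factor analysis is a statistical method used to identify underlying factors that explain
the correlations among a set of variables. The most important variables are those that
have the highest factor loadings, indicating a strong correlation with the identified
factor. Here a single component was extracted, explaining 87.944% of the total variance.
DiasBP (.991) and SysBP (.978) load most strongly on it, followed by HR (.954), Age
(.951), and Cholesterol (.803).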
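
When only one component is extracted, each variable's communality is the square of its loading, and the squared loadings sum to the component's eigenvalue. The sketch below checks this against the tables above and ranks the variables by absolute loading; the dictionary and variable names are illustrative.

```python
# Loadings on the single extracted component, from the Component Matrix.
loadings = {
    "Age": 0.951,
    "SysBP": 0.978,
    "DiasBP": 0.991,
    "HR": 0.954,
    "Cholesterol": 0.803,
}

# With one component, communality = loading squared.
communalities = {name: value ** 2 for name, value in loadings.items()}

# The sum of squared loadings equals the eigenvalue of the component.
eigenvalue = sum(communalities.values())

# Rank variables from strongest to weakest association with the component.
ranking = sorted(loadings, key=lambda name: abs(loadings[name]), reverse=True)

for name in ranking:
    print(f"{name:<12} loading = {loadings[name]:.3f}  communality = {communalities[name]:.3f}")
print(f"Sum of squared loadings = {eigenvalue:.3f}")
```

The ranking places the two blood pressure measures first and cholesterol last, which matches the order of the Extraction column in the Communalities table.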
