Project Deliverable 3
1. Problem Statement
The primary objective of this analysis is to investigate factors associated with
cardiovascular disease. Specifically, we examine whether selected demographic and
health indicators (age, resting blood pressure, serum cholesterol, and maximum
heart rate) are significantly associated with the presence of cardiovascular
disease. The analysis uses hypothesis tests to compare group means and predictive
modeling to identify key risk factors.
2. Dataset Description
2.1. Dataset Overview
The dataset contains the following variables. The brief descriptions are based on
typical cardiovascular datasets:
patientid: Unique identifier for each patient.
age: Patient's age in years.
gender: Gender of the patient (1 = Male, 0 = Female).
chestpain: Type of chest pain experienced (0–3, with different types indicating
various risks of heart disease).
restingBP: Resting blood pressure in mm Hg.
serumcholestrol: Serum cholesterol level in mg/dL.
fastingbloodsugar: Whether fasting blood sugar > 120 mg/dL (1 = Yes, 0 =
No).
restingrelectro: Resting electrocardiographic results (0–2, with higher values
possibly indicating abnormalities).
maxheartrate: Maximum heart rate achieved.
exerciseangia: Exercise-induced angina (1 = Yes, 0 = No).
oldpeak: ST depression induced by exercise relative to rest.
slope: The slope of the peak exercise ST segment (0–2).
noofmajorvessels: Number of major vessels (0–3) colored by fluoroscopy.
target: Outcome variable (1 = Heart disease, 0 = No heart disease).
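The coding scheme listed above can be checked mechanically before any analysis is run. The following is a minimal sketch, assuming each record is a Python dictionary keyed by the variable names above. The ALLOWED table, the function name find_invalid_codes, and the sample rows are illustrative assumptions, not part of the original analysis or data.

```python
# Allowed codes for the categorical variables, taken from the descriptions above.
ALLOWED = {
    "gender": {0, 1},
    "chestpain": {0, 1, 2, 3},
    "fastingbloodsugar": {0, 1},
    "restingrelectro": {0, 1, 2},
    "exerciseangia": {0, 1},
    "slope": {0, 1, 2},
    "noofmajorvessels": {0, 1, 2, 3},
    "target": {0, 1},
}


def find_invalid_codes(rows):
    """Return (patientid, column, value) for every missing or out-of-range code."""
    problems = []
    for row in rows:
        for column, allowed in ALLOWED.items():
            value = row.get(column)
            if value not in allowed:
                problems.append((row.get("patientid"), column, value))
    return problems


# Hypothetical records for illustration only; these are not real patient data.
sample_rows = [
    {"patientid": 1, "age": 54, "gender": 1, "chestpain": 2, "restingBP": 130,
     "serumcholestrol": 240, "fastingbloodsugar": 0, "restingrelectro": 1,
     "maxheartrate": 150, "exerciseangia": 0, "oldpeak": 1.2, "slope": 1,
     "noofmajorvessels": 0, "target": 1},
    {"patientid": 2, "age": 61, "gender": 0, "chestpain": 0, "restingBP": 118,
     "serumcholestrol": 205, "fastingbloodsugar": 1, "restingrelectro": 0,
     "maxheartrate": 162, "exerciseangia": 1, "oldpeak": 0.4, "slope": 5,
     "noofmajorvessels": 2, "target": 0},
]

invalid = find_invalid_codes(sample_rows)
print(invalid)
```

Records flagged this way would need to be inspected or excluded before the group comparisons and the regression are run.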
3. Hypotheses
Based on the research objectives, we define the following hypotheses for the analysis:
Hypothesis 1:
There is a significant difference in the mean resting blood pressure between
patients with and without heart disease.
Null Hypothesis (H0): There is no difference in mean resting blood pressure between
patients with and without heart disease.
Alternative Hypothesis (H1): Patients with heart disease have a different mean
resting blood pressure than those without.
Hypothesis 2:
There is a significant difference in the mean maximum heart rate between
patients with and without heart disease.
Null Hypothesis (H0): There is no difference in mean maximum heart rate between
patients with and without heart disease.
Alternative Hypothesis (H1): Patients with heart disease have a different mean
maximum heart rate than those without.
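Hypotheses 1 and 2 both compare the mean of a continuous variable between two groups. One common way to do this is Welch's two-sample t-test, which does not assume equal variances. The source does not state which test was used, so the following is only a sketch under that assumption. The function name welch_t and the sample values are illustrative, not taken from the dataset.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical resting blood pressure values (mm Hg), split by target.
resting_bp_disease = [150, 160, 145, 155, 170]
resting_bp_no_disease = [120, 130, 125, 135, 128]

t_stat, dof = welch_t(resting_bp_disease, resting_bp_no_disease)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

The sketch stops at the test statistic because the two-sided p-value requires the t distribution, which the Python standard library does not provide. If SciPy is available, scipy.stats.ttest_ind(a, b, equal_var=False) performs the same test and returns the p-value. The same function applies to maxheartrate for Hypothesis 2.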
Hypothesis 3:
Age and serum cholesterol level are associated with the risk of heart disease.
This can be tested with a regression analysis in which age and serum cholesterol
are the predictors and target (heart disease status) is the outcome. Because
target is binary, logistic regression is a natural choice.
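For a binary outcome such as target, the regression in Hypothesis 3 is usually a logistic regression. The following is a minimal sketch that fits one by plain gradient descent using only the standard library. It is not the method the original analysis is known to have used, and the function names, learning rate, epoch count, and sample values are all illustrative assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def standardize(values):
    """Rescale to mean 0 and standard deviation 1 so one learning rate suits all predictors."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit by batch gradient descent; returns [intercept, coef_1, ..., coef_k]."""
    n, k = len(rows), len(rows[0])
    weights = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))
            err = p - y  # derivative of the log-loss with respect to the linear term
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


# Hypothetical values for illustration only; these are not real patient data.
age = [40, 45, 50, 55, 60, 65, 70, 52]
serumcholestrol = [180, 220, 200, 260, 240, 300, 280, 210]
target = [0, 0, 0, 1, 0, 1, 1, 1]

rows = list(zip(standardize(age), standardize(serumcholestrol)))
weights = fit_logistic(rows, target)
intercept, coef_age, coef_chol = weights
print(intercept, coef_age, coef_chol)
```

Because the predictors are standardized, each coefficient is the change in log-odds of heart disease per one standard deviation increase in that predictor. This sketch gives point estimates only. Testing whether a coefficient differs from zero also requires standard errors, which a statistics library such as statsmodels reports.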
Plots:
6. Improvement and Suggestions for Decision Making
Health Screening and Monitoring:
Given their strong associations with heart disease, monitor resting blood pressure
and maximum heart rate regularly, particularly for patients in high-risk
categories, as early indicators of cardiovascular risk.
Serum cholesterol management, including lifestyle adjustments and, if
necessary, medication, is recommended as a preventative measure against heart
disease.