Capstone 1 Problem Statement
Problem statement:
The rising cost of healthcare is a significant public health concern, so it is crucial
to be able to predict future costs and understand what drives them. The insurance
industry also has a stake in this analysis: healthcare insurance providers can use it
to make a variety of strategic and tactical decisions.
Objective:
The objective of this project is to predict patients’ healthcare costs and to identify factors
contributing to this prediction. It will also be useful to learn the interdependencies of
different factors and comprehend the significance of various tools at various stages of
the healthcare cost prediction process.
Dataset Snapshot
Hospitalization details.xlsx
Dataset Description
Hospitalization details.xlsx
Variable        Description
Customer ID     Unique identification for the beneficiary (primary)
year            Year of birth
month           Month of birth
date            Date of birth
children        Number of children as dependents
charges         Hospitalization cost
Hospital tier   Level of the hospital, tier-1 being the best
City tier       Level of the city per government document, tier-1 being the most developed
State ID        ID of the state
Dataset Snapshot
Medical Examinations.xlsx
Dataset Description
Medical Examinations.xlsx
Variable                Description
Customer ID             Unique identification for the beneficiary (primary)
BMI                     Body mass index of the individual (BMI estimates body fat from height and weight)
HBA1C                   HBA1C report (HBA1C measures the amount of sugar (glucose) in the blood; a value greater than 6.5 is considered diabetic)
Heart Issues            Whether the patient has heart-related issues
Any Transplants         Whether the patient has had any transplants
Cancer history          Whether there is any history of cancer in the patient's family
NumberOfMajorSurgeries  Number of major surgeries the patient has undergone
smoker                  Whether the patient smokes cigarettes
Dataset Snapshot
Names.xlsx
Dataset Description
Names.xlsx
Variable        Description
Customer ID     Unique identification for the beneficiary (primary)
name            Name of the beneficiary (primary)
Project Task: Week 1
Data science
7. Age appears to be a significant factor in this analysis. Calculate the patients' ages based on their
dates of birth.
8. The gender of the patient may be an important factor in determining the cost of hospitalization.
The salutations in a beneficiary's name can be used to determine their gender. Make a new field
for the beneficiary's gender.
9. You should also visualize the distribution of costs using a histogram, box and whisker plot, and
swarm plot.
10. State how the cost distribution differs across gender and across hospital tiers.
11. Create a radar chart to showcase the median hospitalization cost for each tier of hospitals.
12. Create a frequency table and a stacked bar chart to visualize the count of people in the different
tiers of cities and hospitals.
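Tasks 7 and 8 can be sketched as follows. This is a minimal illustration, not a prescribed solution. It assumes that the year, month, and date columns hold numeric values (with date being the day of the month), that names follow the "Last, Salutation. First" pattern seen in the case scenario, and that age is measured against a fixed reference date. The toy rows, the reference date, the name "Doe, Mr. John", and the salutation-to-gender mapping are all assumptions made for this example.

```python
import pandas as pd

# Toy rows in the assumed layout; the real values come from the .xlsx files.
hosp = pd.DataFrame({
    "Customer ID": ["Id1", "Id2", "Id3"],
    "year": [1988, 1975, 2000],
    "month": [12, 6, 1],   # assumed numeric; convert first if stored as names
    "date": [28, 3, 1],    # assumed to be the day of the month
})
names = pd.DataFrame({
    "Customer ID": ["Id1", "Id2"],
    "name": ["Christopher, Ms. Jayna", "Doe, Mr. John"],  # second name is made up
})

# Task 7: assemble a date of birth, then count completed years.
dob = pd.to_datetime(hosp.rename(columns={"date": "day"})[["year", "month", "day"]])
AS_OF = pd.Timestamp("2024-01-01")  # assumed reference date

def age_in_years(dob: pd.Series, as_of: pd.Timestamp) -> pd.Series:
    """Completed years between each date of birth and the reference date."""
    had_birthday = (dob.dt.month < as_of.month) | (
        (dob.dt.month == as_of.month) & (dob.dt.day <= as_of.day)
    )
    return as_of.year - dob.dt.year - (~had_birthday).astype(int)

hosp["age"] = age_in_years(dob, AS_OF)

# Task 8: pull the salutation that follows the comma and map it to a gender.
SALUTATION_TO_GENDER = {"Mr": "male", "Ms": "female", "Mrs": "female", "Miss": "female"}
salutation = names["name"].str.extract(r",\s*(\w+)\.", expand=False)
names["gender"] = salutation.map(SALUTATION_TO_GENDER)  # unmapped values become NaN

print(hosp[["Customer ID", "age"]])
print(names[["Customer ID", "gender"]])
```

Salutations missing from the mapping are left as missing values rather than guessed, so they can be reviewed by hand.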
Project Task: Week 1
Machine learning
3. Case scenario:
Estimate the cost of hospitalization for Christopher, Ms. Jayna (date of
birth 12/28/1988, height 170 cm, weight 85 kg). She lives with her partner and two children
in a tier-1 city, and her State ID is R1011. She was found to be nondiabetic (HbA1c = 5.8).
She smokes but is otherwise healthy. She has had no transplants or major surgeries. Her father
died of lung cancer. Estimate the hospitalization cost for a tier-1 hospital.
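Before a model can score this case, the description has to be turned into the same variables the datasets use. The sketch below shows one way to do that, deriving BMI as weight in kilograms divided by height in metres squared and age from the date of birth. The reference date and the value encodings ("yes"/"no", "tier-1") are assumptions for illustration; the actual files may encode these fields differently.

```python
from datetime import date

def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2

def age_on(dob: date, as_of: date) -> int:
    """Completed years between a date of birth and a reference date."""
    return as_of.year - dob.year - ((as_of.month, as_of.day) < (dob.month, dob.day))

AS_OF = date(2024, 1, 1)  # assumed reference date; use the one chosen for the dataset

# Value encodings below are assumptions and must match the training data.
jayna = {
    "age": age_on(date(1988, 12, 28), AS_OF),
    "children": 2,
    "BMI": round(bmi(85, 170), 2),
    "HBA1C": 5.8,
    "Heart Issues": "no",          # "otherwise healthy"
    "Any Transplants": "no",
    "Cancer history": "yes",       # father died of lung cancer (family history)
    "NumberOfMajorSurgeries": 0,
    "smoker": "yes",
    "Hospital tier": "tier-1",
    "City tier": "tier-1",
    "State ID": "R1011",
}

print(jayna)
```

The resulting record would then go through the same preprocessing as the training data before being passed to the fitted model.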
SQL
a. Merge the two tables by first identifying the columns in the data tables that can serve
as the merge key.
b. In both tables, add a Primary Key constraint for these columns.
Hint: You can remove duplicates and null values from the column and then use ALTER TABLE to add a
Primary Key constraint.
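A primary key must be unique and non-null, which is why the hint asks for duplicates and nulls to be removed first. The same cleaning can be rehearsed in pandas before running the SQL. This is only a sketch of that check, not the SQL solution itself, and it assumes Customer ID is the shared column. The helper clean_key and the toy rows are hypothetical.

```python
import pandas as pd

def clean_key(df: pd.DataFrame, key: str) -> pd.DataFrame:
    """Drop rows whose key is null, then keep the first row per key value."""
    return df.dropna(subset=[key]).drop_duplicates(subset=[key], keep="first")

# Made-up rows standing in for the two tables.
hosp = pd.DataFrame({
    "Customer ID": ["Id1", "Id2", "Id2", None],
    "charges": [100.0, 200.0, 200.0, 300.0],
})
med = pd.DataFrame({
    "Customer ID": ["Id1", "Id2", "Id3"],
    "BMI": [22.5, 30.1, 27.0],
})

hosp_clean = clean_key(hosp, "Customer ID")
med_clean = clean_key(med, "Customer ID")

# The column now satisfies what a primary key requires: unique and non-null.
assert hosp_clean["Customer ID"].is_unique
assert hosp_clean["Customer ID"].notna().all()

# validate="one_to_one" makes pandas raise if either side still has duplicate keys.
merged = hosp_clean.merge(med_clean, on="Customer ID", how="inner", validate="one_to_one")
print(merged)
```

An inner join keeps only customers present in both tables; whether that is the right join type depends on how missing records should be treated.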
Project Task: Week 2
SQL
2. Retrieve information about people who are diabetic and have heart problems with their average
age, the average number of dependent children, average BMI, and average hospitalization costs
3. Find the average hospitalization cost for each hospital tier and each city level
4. Determine the number of people who have had major surgery with a history of cancer
5. Determine the number of tier-1 hospitals in each state
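The aggregations in tasks 2 to 5 can be tried out on an in-memory SQLite database from Python. The sketch below assumes a single merged table; the table name patients, its column names, the "yes"/"no" encodings, and the four rows are all made up for illustration. Because the variables listed in the dataset description include no hospital identifier, the last query counts tier-1 hospitalization records per state rather than distinct hospitals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE patients (
        customer_id     TEXT PRIMARY KEY,
        hospital_tier   TEXT,
        city_tier       TEXT,
        state_id        TEXT,
        charges         REAL,
        hba1c           REAL,
        heart_issues    TEXT,
        cancer_history  TEXT,
        major_surgeries INTEGER
    )
""")
rows = [  # made-up rows
    ("Id1", "tier-1", "tier-1", "R1011", 1000.0, 7.0, "yes", "yes", 1),
    ("Id2", "tier-1", "tier-2", "R1011", 3000.0, 5.5, "no", "no", 0),
    ("Id3", "tier-2", "tier-1", "R1012", 500.0, 6.9, "yes", "yes", 2),
    ("Id4", "tier-1", "tier-1", "R1012", 2000.0, 5.0, "no", "yes", 0),
]
conn.executemany("INSERT INTO patients VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)", rows)

# Task 2 (simplified): average cost for diabetic patients (HBA1C > 6.5) with heart issues.
avg_diabetic_heart = conn.execute("""
    SELECT AVG(charges) FROM patients
    WHERE hba1c > 6.5 AND heart_issues = 'yes'
""").fetchone()[0]

# Task 3: average cost for each hospital tier and city tier.
avg_by_tier = conn.execute("""
    SELECT hospital_tier, city_tier, AVG(charges)
    FROM patients
    GROUP BY hospital_tier, city_tier
    ORDER BY hospital_tier, city_tier
""").fetchall()

# Task 4: people with at least one major surgery and a history of cancer.
surgery_and_cancer = conn.execute("""
    SELECT COUNT(*) FROM patients
    WHERE major_surgeries > 0 AND cancer_history = 'yes'
""").fetchone()[0]

# Task 5: tier-1 hospitalization records per state.
tier1_by_state = conn.execute("""
    SELECT state_id, COUNT(*) FROM patients
    WHERE hospital_tier = 'tier-1'
    GROUP BY state_id
    ORDER BY state_id
""").fetchall()

print(avg_diabetic_heart, avg_by_tier, surgery_and_cancer, tier1_by_state)
```

The full version of task 2 would add the average age, number of children, and BMI to the same SELECT once those columns are present in the merged table.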
Project Task: Week 2
Tableau
1. Create a dashboard in Tableau by selecting the appropriate chart types and business metrics