Case Study 1

The document outlines a case study for an MBA course focused on data wrangling and summary statistics in a health consultancy context. It includes tasks such as calculating BMI, classifying BMI, addressing potential selection issues with missing data, and generating summary statistics for various patient demographics. The goal is to analyze factors influencing cardiovascular disease risk among patients.

Uploaded by

Shrey Watal

MBA 640: Case Study 1

Data Wrangling and Summary Statistics

You work at a health consultancy firm that is tasked with identifying patients at
high risk of cardiovascular disease. According to the medical literature,
cardiovascular disease risk is influenced by a patient’s age, body mass index (BMI),
amount of exercise, race, and education. The dataset contains each patient’s race,
body weight (in kilograms), height (in meters), date of birth, number of minutes of
exercise, and education level. Cells with NA indicate missing data (i.e., no answer was given).

1. Create a new variable called “BMI” that contains the body mass index of
patients. [6 points]
Note 1: BMI is calculated as $\mathrm{BMI} = \dfrac{\text{weight (kg)}}{[\text{height (m)}]^2}$.

Note 2: When calculating BMI use mean imputation for missing data.
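
As an illustration of Notes 1 and 2, here is a minimal Python sketch using only the standard library. It assumes that mean imputation is applied to the missing weight and height values before BMI is computed (the task does not say whether to impute the inputs or the BMI itself). The sample values and the names `impute_mean`, `weights_kg`, and `heights_m` are hypothetical and do not come from the dataset.

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical sample; None stands in for the NA cells in the dataset.
weights_kg = [70.0, None, 90.0]
heights_m = [1.75, 1.60, None]

# Assumption: impute the inputs first, then compute BMI from the filled values.
weights_filled = impute_mean(weights_kg)
heights_filled = impute_mean(heights_m)

# BMI = weight (kg) / height (m) squared
bmi = [w / h ** 2 for w, h in zip(weights_filled, heights_filled)]
```

Mean imputation keeps every patient in the sample, but it also shrinks the variance of the imputed variable, which is worth keeping in mind when reading any standard deviation computed afterwards.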

2. Create a variable called “BMI Classification” labelling observations as:

1 if BMI<18.5 (underweight)
2 if 18.5≤BMI≤24.9 (healthy weight)
3 if 25.0≤BMI≤29.9 (overweight)
4 if BMI≥30 (obese)

[5 points]
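
The classification rule can be sketched in Python as follows. The listed ranges leave small gaps (for example, a BMI of 24.95 falls in no category), so this sketch assumes cut points at 18.5, 25.0, and 30.0; that choice is an assumption, not something the task states. The function name `classify_bmi` is hypothetical.

```python
from bisect import bisect_right

# Assumed cut points: class 1 below 18.5, class 2 from 18.5 up to (not including) 25.0,
# class 3 from 25.0 up to (not including) 30.0, class 4 from 30.0 upward.
CUT_POINTS = [18.5, 25.0, 30.0]

def classify_bmi(bmi):
    """Return the BMI class (1-4), or None if BMI is missing."""
    if bmi is None:
        return None
    # bisect_right counts how many cut points are <= bmi.
    return bisect_right(CUT_POINTS, bmi) + 1
```

For values reported to one decimal place, this rule agrees with the four ranges listed in the task.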

3. You are worried about selection issues with missing data and in particular
that women are more likely to not report their weight. Is this issue likely to
be a concern in your data? Justify your answer. [6 points]

4. Calculate each patient’s age as of April 1st, 2025. [2 points]
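
One way to compute age in completed years is sketched below in Python. It assumes the date of birth has already been parsed into a `datetime.date` (the storage format in the dataset is not specified here); the names `REFERENCE_DATE` and `age_on` are hypothetical.

```python
from datetime import date

REFERENCE_DATE = date(2025, 4, 1)

def age_on(dob, ref=REFERENCE_DATE):
    """Completed years between date of birth and the reference date."""
    # The birthday has occurred this year if (month, day) of ref is not before that of dob.
    had_birthday = (ref.month, ref.day) >= (dob.month, dob.day)
    return ref.year - dob.year - (0 if had_birthday else 1)
```

Comparing (month, day) pairs avoids the off-by-one error that comes from dividing a day count by 365.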

5. Create a variable called “Years of Education”, where observations are coded as having:

12 years of education if education=HS
14 years of education if education=Some College
16 years of education if education=College
18 years of education if education=Graduate

[5 points]
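
The recoding above amounts to a lookup table. A minimal Python sketch, assuming the education labels appear in the data exactly as written in the task (apart from stray whitespace); the names `EDUCATION_YEARS` and `years_of_education` are hypothetical.

```python
# Mapping taken from the task statement.
EDUCATION_YEARS = {
    "HS": 12,
    "Some College": 14,
    "College": 16,
    "Graduate": 18,
}

def years_of_education(level):
    """Return years of education, or None if the level is missing or unrecognized."""
    if level is None:
        return None
    return EDUCATION_YEARS.get(level.strip())
```

Returning None for unrecognized labels, rather than raising an error, keeps missing education values visible as missing in later summaries.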

6. Create a variable called “White” that =1 if race is Non-Hispanic White and
=0 otherwise. [3 points]

7. Produce a table of summary statistics showing the mean and standard
deviation for the following variables: BMI, Age, Exercise, Years of Education,
and White. [4 points]
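
A summary table of this kind can be sketched in Python with the standard library. The sketch assumes each patient is a dictionary, skips missing values, and uses the sample standard deviation (dividing by n - 1); the task does not say which standard deviation is expected. The rows and the name `summarize` are hypothetical.

```python
from statistics import mean, stdev

def summarize(rows, columns):
    """Return {column: (mean, sample standard deviation)}, skipping missing values."""
    summary = {}
    for col in columns:
        values = [r[col] for r in rows if r.get(col) is not None]
        summary[col] = (mean(values), stdev(values))
    return summary

# Hypothetical rows, not taken from the dataset.
rows = [
    {"BMI": 22.0, "Age": 30, "White": 1},
    {"BMI": 28.0, "Age": 50, "White": 0},
    {"BMI": 31.0, "Age": 40, "White": 1},
]

summary = summarize(rows, ["BMI", "Age", "White"])
print(f"{'Variable':<10}{'Mean':>10}{'SD':>10}")
for name, (m, sd) in summary.items():
    print(f"{name:<10}{m:>10.2f}{sd:>10.2f}")
```

For a 0/1 variable such as White, the mean is the share of patients in that group.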

8. Produce a table showing obesity by education level. [4 points]

10. Produce a table showing obesity by education level for (a) non-Hispanic
White individuals and (b) non-Hispanic Black individuals. [6 points]
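
Tables of obesity by education level, either for all patients or restricted to one race group, can be sketched in Python as below. The sketch assumes "obese" means BMI≥30 (class 4 above) and reports the count and share of obese patients per education level. The row layout, the race labels, and the name `obesity_by_education` are hypothetical.

```python
from collections import defaultdict

def obesity_by_education(rows, race=None):
    """Return {education level: (n patients, share obese)}, optionally for one race group."""
    counts = defaultdict(lambda: [0, 0])  # level -> [n patients, n obese]
    for r in rows:
        if race is not None and r["race"] != race:
            continue
        if r["bmi"] is None or r["education"] is None:
            continue  # skip rows that cannot be classified
        counts[r["education"]][0] += 1
        counts[r["education"]][1] += r["bmi"] >= 30  # True counts as 1
    return {level: (n, obese / n) for level, (n, obese) in counts.items()}

# Hypothetical rows, not taken from the dataset.
rows = [
    {"race": "Non-Hispanic White", "education": "HS", "bmi": 31.0},
    {"race": "Non-Hispanic White", "education": "HS", "bmi": 24.0},
    {"race": "Non-Hispanic Black", "education": "HS", "bmi": 33.0},
    {"race": "Non-Hispanic Black", "education": "College", "bmi": 22.0},
]

overall = obesity_by_education(rows)
white = obesity_by_education(rows, race="Non-Hispanic White")
black = obesity_by_education(rows, race="Non-Hispanic Black")
```

Reporting the group size next to each share makes it easier to spot cells that rest on very few patients.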
