Homework M2 - 9-2-22

The document provides data on fasting blood glucose levels for 18 diabetics who completed a diabetes control class. It asks a series of questions about calculating summary statistics and analyzing the data:
- The median fasting blood glucose level was 147.5 mg/dl, while the mean was higher at about 163 mg/dl. The interquartile range was 69.25 mg/dl.
- The median is the best measure of central tendency to report since the data are right-skewed.
- Standard deviation would be the best measure of dispersion due to the variability in the data.
- Based on the data, the success of the diabetes control class was limited since most participants remained above the target range for blood glucose levels.

Uploaded by

Ash Shoaib

PHPH 7017 Homework M2

1. People with diabetes must monitor and control their blood glucose level. The goal is to maintain
“fasting plasma glucose” between 90 and 130 milligrams per deciliter (mg/dl). Here are the
fasting plasma glucose levels for 18 diabetics enrolled in a diabetes control class, 5 months after
the class has ended.

141 158 112 153 134  95
 96  78 148 172 200 271
103 359 145 147 172 255

a) Calculate the median, mean, geometric mean, range, variance, standard deviation, and
interquartile range (IQR) of fasting blood glucose levels.

Note: Sometimes when you calculate a percentile location you may get something like 5.5,
which means the median lies halfway between the 5th and 6th values in the ordered
dataset, so we can just average the 5th and 6th values to get the median. But if, for the 1st or 3rd
quartile, you get a location like 5.75, you have to find the number that is ¾ of the way between
the 5th and 6th observations. To do this, find the difference between the 5th and 6th observations
in the ordered dataset, multiply the difference by 0.75, and then add this amount to the 5th
observation in the ordered dataset.

We did not see an example of this in the lecture, but you will in this problem.
As an example, assume we have the data: 34 78 79 81
The 3rd quartile location (75th percentile) is 0.75(4+1) = 3.75
So the 3rd and 4th observations are 79 and 81
Calculate: (81-79)*0.75 = 1.5
So the 3rd quartile is Q3 = 79+1.5 = 80.5
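
The interpolation rule in the note can be sketched in Python. This is only an illustration, not course-provided code: the function name `percentile_by_location` is hypothetical, and it assumes the p(n+1) location rule used in the example above.

```python
import math

def percentile_by_location(values, p):
    """Return the percentile for proportion p (0 < p < 1) using location p*(n+1).

    Hypothetical helper for illustration: it interpolates between the two
    neighboring observations in the ordered dataset, as the note describes.
    """
    ordered = sorted(values)
    n = len(ordered)
    location = p * (n + 1)          # e.g. 0.75 * (4 + 1) = 3.75
    lower = math.floor(location)    # 1-based position of the observation below
    fraction = location - lower     # how far to move toward the next observation
    # Locations that fall outside positions 1..n return the smallest/largest value.
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below = ordered[lower - 1]      # convert 1-based position to 0-based index
    above = ordered[lower]
    return below + fraction * (above - below)

example = [34, 78, 79, 81]
q3 = percentile_by_location(example, 0.75)
print(q3)  # 80.5
```

With a location of 5.5 the same function simply averages the 5th and 6th ordered values, so the median does not need a separate rule.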

b) Which measure of central tendency would you select to report for these data and why?

c) Which measure of dispersion would you select to report for these data and why?

d) What is your conclusion on the diabetic control class’s success based on these data?

e) Sketch the boxplot of these data
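
Hand calculations for Question 1 can be checked with Python's standard `statistics` module. The sketch below is one possible check, not part of the assignment. It assumes `method="exclusive"`, which places quartiles at p(n+1) as in the note for part a, and it assumes the common 1.5 × IQR convention for flagging boxplot outliers, which the assignment does not specify.

```python
import statistics

glucose = [141, 158, 112, 153, 134, 95,
           96, 78, 148, 172, 200, 271,
           103, 359, 145, 147, 172, 255]

# "exclusive" puts the quartile locations at p * (n + 1), matching the note.
q1, median, q3 = statistics.quantiles(glucose, n=4, method="exclusive")
iqr = q3 - q1

summary = {
    "mean": statistics.mean(glucose),
    "geometric mean": statistics.geometric_mean(glucose),
    "median": median,
    "range": max(glucose) - min(glucose),
    "variance": statistics.variance(glucose),  # sample variance (divides by n - 1)
    "std dev": statistics.stdev(glucose),      # sample standard deviation
    "Q1": q1,
    "Q3": q3,
    "IQR": iqr,
}

# Assumed boxplot convention: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in glucose if x < lower_fence or x > upper_fence]

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
print("outliers:", outliers)
```

The quartiles, median, and any flagged outliers are the values needed to sketch the boxplot in part e.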


2. Suppose that you and your friends emptied the coins in your pockets, wallets, etc., and recorded the
year marked on each coin. What do you think the shape of this distribution would look like:
skewed left, skewed right, or symmetric? Explain.

3. Use the following 7 numbers:

1 10 100 1000 10000 100000 1000000

a) Calculate the geometric mean by multiplying the 7 numbers together and taking the
7th root.
b) Use log base 10 and convert all these numbers to the log scale. List the converted
log base 10 values.
c) Using the results from part b, find the geometric mean by averaging all the
converted log base 10 numbers and taking the antilog base 10. The answer should
be the same as part a.
d) Instead of using log base 10, use the natural log to compute the geometric mean.
Show all steps.
This answer should be EXACTLY the same as part a. If it is not, it should be very, very
close, and any difference is due to rounding.
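
The three routes to the geometric mean in Question 3 can be compared side by side in Python. This is a minimal sketch using only the standard `math` module; the variable names are illustrative, and the three results should agree up to floating-point rounding.

```python
import math

numbers = [1, 10, 100, 1000, 10000, 100000, 1000000]
n = len(numbers)

# Part a: multiply the numbers together, then take the n-th root.
gm_product = math.prod(numbers) ** (1 / n)

# Parts b and c: average the base-10 logs, then take the base-10 antilog.
log10_values = [math.log10(x) for x in numbers]
gm_log10 = 10 ** (sum(log10_values) / n)

# Part d: same idea with natural logs, using exp as the antilog.
ln_values = [math.log(x) for x in numbers]
gm_ln = math.exp(sum(ln_values) / n)

print(log10_values)
print(gm_product, gm_log10, gm_ln)
```

The base of the logarithm does not matter as long as the antilog uses the same base, which is why parts c and d give the same answer.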
4. Compare the distributions of household size for South Africa and the United Kingdom (UK)
based on a sample of survey respondents.

a) Which country has more dispersion in household size? What might this tell you about each
country?
b) While you do not have the data, if you were to calculate the mean household size for each
country, which country do you expect has a larger mean?
c) For South Africa, which measure of central tendency do you expect to be the better statistic
to summarize the typical value – MEAN or MEDIAN?
5. Use the following results from the DIETFITS randomized controlled trial to answer the questions below.

a) Which group had the largest median weight loss over 12-months? (Note: weight loss is shown as
a negative 12-mo weight change in boxplots)
b) Which group has the smallest interquartile range of weight change over 12-months?
c) Which group has the person who had the most weight loss over 12-months?
d) Does this graph (and data) provide any compelling evidence that the type of diet matters or
whether a person’s genotype matters for weight loss over 12 months? Explain.

6. In the following dataset of 10 people, I assign a 1 if you have hepatitis A and 0 if you do not have
hepatitis A.

1 0 0 0 1 0 1 0 0 0

a) What proportion of people have hepatitis A? What proportion do NOT have hepatitis A?

b) Use the sample mean formula and calculate the mean of these data. Notice that the sample
mean formula of 0-1 data gives you the proportion.
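
The point of part b, that the sample mean of 0-1 data equals the proportion of 1s, can be illustrated with a few lines of Python. This is a small sketch; the variable names are illustrative.

```python
has_hep_a = [1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
n = len(has_hep_a)

# Proportions by direct counting.
proportion_with = has_hep_a.count(1) / n
proportion_without = has_hep_a.count(0) / n

# Sample mean formula: sum of the observations divided by n.
sample_mean = sum(has_hep_a) / n

print(proportion_with, proportion_without, sample_mean)  # 0.3 0.7 0.3
```

Summing 0-1 values counts the 1s, so dividing that sum by n is the same calculation as the proportion.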
7. The Framingham Heart Study is a long-term prospective study of the etiology of cardiovascular
disease among a population of subjects in the community of Framingham, Massachusetts. The
Framingham Heart Study was a landmark study in epidemiology in that it was the first prospective
study of cardiovascular disease and identified the concept of risk factors and their joint effects. The
study began in 1948, and 5,209 subjects were initially enrolled. Participants have been
examined biennially since the inception of the study and all subjects are continuously followed
through regular surveillance for cardiovascular outcomes.

The following histogram shows the distribution of age at examination.

The sample mean is 54.79 years and the sample standard deviation is 9.56 years.

a) What range of ages contains approximately 95% of the subjects?

b) What age is 2.8 standard deviations below the mean?

c) How many standard deviations above the mean is 81.56 years?

d) What age is 1 standard deviation above the mean?

e) How many standard deviations below the mean is 43.52 years?
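
The parts of Question 7 all convert between ages and numbers of standard deviations from the mean. A minimal Python sketch of the two conversions follows; the helper names `age_at` and `z_score` are hypothetical, and part a assumes the empirical rule (about 95% of values within 2 standard deviations of the mean for a roughly bell-shaped distribution).

```python
MEAN_AGE = 54.79  # sample mean, in years
SD_AGE = 9.56     # sample standard deviation, in years

def age_at(z):
    """Age that lies z standard deviations from the mean (negative z = below)."""
    return MEAN_AGE + z * SD_AGE

def z_score(age):
    """Number of standard deviations an age lies from the mean."""
    return (age - MEAN_AGE) / SD_AGE

# Part a, assuming the empirical rule: mean plus or minus 2 standard deviations.
low, high = age_at(-2), age_at(2)
print(f"a) {low:.2f} to {high:.2f} years")

print(f"b) {age_at(-2.8):.2f} years")
print(f"c) {z_score(81.56):.2f} standard deviations")
print(f"d) {age_at(1):.2f} years")
print(f"e) {z_score(43.52):.2f} standard deviations")
```

A negative result from `z_score` means the age is below the mean, so its absolute value answers a "how many standard deviations below" question.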
